Forecasts 2011: Linked Open Data for MuseoTorino. Best wishes!

Much of the really useful information has always been hidden in various systems around the world, and even the largest institutions have been reluctant to share their results. Most public data was not made available under a license that permits redistribution. That is the past.

Now to the present: in several parts of the globe we have witnessed data releases by activists who have become, in effect, the Robin Hoods of open data!

Yet institutions fail to keep up with the times: they are five or ten years behind what new technologies make possible. Still, somebody is winning the battle.

And what does the Future mean for us? In Italy we do not need a compromise, but a full 180-degree turn!
We need to free up data and information, because these vast clouds of free information contain facts that flash like lightning bolts!
An attempt: to use statistics and data visualization to explain the whole world, or at least this boot of ours, over the last two centuries, in order to see the future!

I was very impressed by the words of David Eaves, a Canadian open data activist:
“Just as libraries were not built for people who were already literate, open data portals are not just for a small elite of hackers and policy wonks.” When the western world was busy building libraries in the 19th and early 20th centuries, they were built on the belief that they would act as hubs to help citizens become literate and in doing so benefit society as a whole.
In the world of international development, opening up data and building portals that offer convenient access for users may seem like nerdy technical endeavors now, but these are the first steps towards a more effective data-literate development sector. Fundraisers, researchers, policy experts, administrators, consultants, field workers, local staff, community activists and individuals who are directly affected by the aid will benefit from better access to information they can use. 

In other words: “Just as libraries were not built for those who already knew how to read and write, so open data is not intended for a small community of hackers or policy experts. …”
That is what libraries meant: they were conceived as hubs of cultural growth for the benefit of the whole of society, not just part of it.

In the world of international development, open data and portal building offer users easy access to important, structurally related information. All this smacks of technology for now, but these are the first steps toward a free-data sector in the field of culture.
Perhaps you would prefer that I speak not in “technical” terms but as a soothsayer of the magic arts who, after appropriate consultation (let’s skip the details of that), shows you the “Future”?

So here is my prediction for 2011: MuseoTorino as an example of LOD technology and of a semantic search engine!

I have the pleasure of presenting an excerpt from the press release that Gian Luca Farina Perseu told me about. He is the designer of the entire IT side of MuseoTorino, which will go online in March for the 150th anniversary of Italian unification:

The MuseoTorino system is a web application that any user can access through a web browser, without the need for authorization. From March 2011 users will be able to access MuseoTorino and navigate the collection in three ways. They can start from the map of the present-day city to display each of the museum’s cards entered into the system, with its georeferenced location. They can explore the cities of the past through an exhibition organized into chronological and thematic floors, each grouping a predefined, selected set of cards. Finally, they can consult the Museum’s catalog, using an intelligent search function that finds the desired information and related objects through different filtering and clustering criteria. It will also be possible to consult the volumes in the Digital Library and the materials stored in the Media Library, for which a Digital Asset Management (DAM) platform is planned to store and organize different types of digital resources (images, video, audio, PDF). At a stage after the inauguration, the system will be able to manage the creation or modification of content by the community of internet users, following the philosophy behind sites such as Wikipedia. Each change can be viewed by other users, who may also report inaccuracies or correct the contents of the cards under the supervision, never intrusive or inhibiting, of the MuseoTorino experts. In this way MuseoTorino can become an ever more complete container of memory, information and knowledge about the city, thanks to the contributions of its users. Anyone who accesses the virtual museum will nevertheless always know when they are accessing correct content, as certified by MuseoTorino.
Designed according to the Open Data philosophy, the MuseoTorino system is based on innovative technologies for sharing information with users and with other systems, in order to provide a database that is as open and accessible as possible. This is why the design gave great importance to making the information contained in MuseoTorino accessible to external systems that can process the data independently of the site itself.
MuseoTorino implements the features that define what is called Web 3.0, or the Semantic Web. Applying the latest standards (RDFa and Open Graph) to how content is organized and stored in the system will allow web search engines (such as Google) and social networks (such as Facebook) to interpret terms according to their meaning in context (http://21-style.com/blog/2010/12/museo-torino/).
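
To make the RDFa and Open Graph point a little more concrete, here is a minimal Python sketch of how a card page could expose Open Graph metadata. The four properties used (og:title, og:type, og:url, og:image) are the basic ones defined by the Open Graph protocol; the card fields, the example.org URLs and the function name are assumptions made up for illustration, not MuseoTorino’s actual markup.

```python
from html import escape


def open_graph_tags(card):
    """Render a card (a plain dict) as Open Graph <meta> tags."""
    properties = {
        "og:title": card["title"],
        "og:type": card["type"],
        "og:url": card["url"],
        "og:image": card["image"],
    }
    return "\n".join(
        f'<meta property="{name}" content="{escape(value, quote=True)}" />'
        for name, value in properties.items()
    )


# Hypothetical card: the field names and example.org URLs are placeholders.
card = {
    "title": "Palazzo Carignano",
    "type": "article",
    "url": "http://example.org/cards/palazzo-carignano",
    "image": "http://example.org/media/palazzo-carignano.jpg",
}
print(open_graph_tags(card))
```

A crawler or a social network that reads these tags no longer has to guess what the page is about, which is the practical benefit the press release alludes to.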

These are the words of Gian Luca:

“By March, basic functions and a first version of the query APIs will be provided for the navigation client (Flash + Google Maps, an alternative to text navigation). In practice, the APIs will provide all the historical and cultural knowledge of the City of Turin in the form of information cards. The project will be based 100% on a GraphDB, with REST/JSON interfaces and, later, RDFa and Open Graph compatibility.”

I asked Gian Luca some details.

Titti: Is the data released to everyone?
Gian Luca: The system adopts two API levels: a closed, application-specific API that supports the Google Maps client, and a public API that can be used by external systems (handhelds, applications, etc.). These APIs may also allow writing, to support uploading or editing data from external systems.

T.: Is the release license 100% free, that is, can everyone access the data and reuse it as they see fit, including for commercial use?
GL: On this front we still have to evaluate, together with the client, how the data will be released, also in light of the recently presented Italian Open Data License. The data in the system is in any case owned by the City of Turin, so it is only a matter of deciding which rights to grant to future users.

T.: What data is released, and in which format? An open one? CSV, XML, or something else?
GL: Data will be released through APIs; the intended format is Open Graph, in the wake of Freebase (www.freebase.com).
For example, the Turin card looks like this: http://graph.freebaseapps.com/turin while Camillo Benso looks like this:
http://graph.freebaseapps.com/camillo_benso_conte_di_cavour
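
As a rough idea of what an external system could do with such a graph-style card, the Python sketch below parses a JSON document and walks its links. The JSON shape, the field names (`id`, `name`, `type`, `links`) and the relation names are assumptions invented for the example; the interview does not describe the actual format at this level of detail.

```python
import json

# Hypothetical card payload, loosely modelled on a graph node with typed links.
payload = """
{
  "id": "camillo_benso_conte_di_cavour",
  "name": "Camillo Benso, Count of Cavour",
  "type": "person",
  "links": [
    {"relation": "born_in", "target": "turin"},
    {"relation": "related_place", "target": "palazzo_cavour"}
  ]
}
"""


def linked_cards(card, relation=None):
    """Return the ids of cards linked from `card`, optionally filtered by relation."""
    return [
        link["target"]
        for link in card.get("links", [])
        if relation is None or link["relation"] == relation
    ]


card = json.loads(payload)
print(card["name"], "->", linked_cards(card))
print("born in:", linked_cards(card, relation="born_in"))
```

Because every link names its target card, a client can follow the graph from one card to the next without knowing the site’s page structure.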

Q: How is the quality of data shown?
GL: Each card is “certified” by the museum department. In the second phase, cards can also be modified by users, Wikipedia-style, but a modified card will lose its “certified” stamp until its contents have been reviewed by MuseoTorino staff.

Q: How are the temporal validity of the data and its updating handled?
GL: There will always be a history of the changes to a card, so that you can see when its content was changed. It goes without saying that, since the museum is 90% focused on the past, once a card is created the updates will probably be minimal, as the content will not need them. Only if, for example, a building is demolished may the related card need to be updated with a new date …
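
The certification and history rules Gian Luca describes can be sketched in a few lines of Python. This is only one possible reading of his answers: the class names, the fields and the `edit` / `certify` methods are hypothetical and say nothing about how the real system is built.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Revision:
    previous_text: str
    new_text: str
    author: str
    timestamp: datetime


@dataclass
class Card:
    title: str
    text: str
    certified: bool = True  # assumption: cards entered by the museum start certified
    history: list = field(default_factory=list)

    def edit(self, new_text, author):
        """A user edit: record the change and drop the 'certified' stamp."""
        self.history.append(
            Revision(self.text, new_text, author, datetime.now(timezone.utc))
        )
        self.text = new_text
        self.certified = False

    def certify(self):
        """Museum staff have reviewed the current text."""
        self.certified = True


card = Card("Palazzo Carignano", "Baroque palace in the centre of Turin.")
card.edit(
    "Baroque palace in the centre of Turin, designed by Guarino Guarini.",
    author="visitor42",
)
print(card.certified, len(card.history))
card.certify()
print(card.certified)
```

Keeping every revision, rather than overwriting the text, is what lets a reader see when a card changed and what it said before.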

T.: And what about the granularity of the data?
GL: On the issue of granularity: data is entered directly into the system through custom editing pages, so the data is native to the system and “certified” as valid, but (in the second phase) also editable by users, with subsequent re-certification by MuseoTorino staff.

And now we leave our soothsaying, which, as was easy to see, was a “consultation” with the “sorcerer” Farina Perseu, and come to the technical questions … but not too technical.

The emergence of semantic web technologies gives machines an understandable representation of the knowledge encoded in web documents. With Linking Open Data (LOD), large, structured public-domain data resources from different domains have been triplified to become interconnected RDF(S) datasets. This provides machines with the semantics needed to deduce cross-dataset links simply. Natural Language Processing (NLP), media analysis and statistics are applied to detect semantic entities and their relationships in multimedia web documents, for example those I would like to find in MuseoTorino. With this in mind, a semantic search engine should not only achieve better precision and recall, but should also suggest what is relevant with regard to content and meaning. Exploration then really becomes possible, allowing users to discover the knowledge hidden in web documents and to solve complex search tasks.
However, one of the essential prerequisites for the technical implementation of an efficient exploratory semantic search engine is the correctness and accuracy of the underlying data. If a semantic search engine is built on linked data, the search results and the exploration recommendations can only be as good as the quality of the underlying data sources and of the entities that need to be linked to the content of the web documents.
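
A toy Python sketch can show what “deducing cross-data links” means in practice. Triples are plain tuples here rather than real RDF, and every identifier (the `mt:` and `dbpedia:` style names, the predicates) is illustrative only; a real system would use an RDF library and full URIs.

```python
# Each triple is (subject, predicate, object). All identifiers are illustrative.
museum_data = [
    ("mt:palazzo_carignano", "rdfs:label", "Palazzo Carignano"),
    ("mt:palazzo_carignano", "mt:locatedIn", "mt:turin"),
    ("mt:turin", "owl:sameAs", "dbpedia:Turin"),
]
external_data = [
    ("dbpedia:Turin", "dbpedia-owl:country", "dbpedia:Italy"),
    ("dbpedia:Turin", "rdfs:label", "Turin"),
]


def equivalents(entity, triples):
    """All identifiers connected to `entity` through owl:sameAs, in either direction."""
    found, frontier = {entity}, [entity]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != "owl:sameAs":
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in found:
                    found.add(b)
                    frontier.append(b)
    return found


def facts_about(entity, triples):
    """Predicate/object pairs for `entity` and everything declared equivalent to it."""
    names = equivalents(entity, triples)
    return sorted(
        (p, o) for s, p, o in triples if s in names and p != "owl:sameAs"
    )


merged = museum_data + external_data
for predicate, obj in facts_about("mt:turin", merged):
    print(predicate, obj)
```

The museum dataset alone says nothing about the country; the single sameAs link is what lets the external facts flow in. It also shows why data quality matters: one wrong link would import wrong facts just as easily.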

Defects sometimes arise from structural, syntactic and semantic inconsistencies, from ambiguity and from missing information. They need to be resolved in order to gain an advantage over traditional keyword search and to fully exploit the exploratory potential of semantic search. How can linked data be used to enable exploratory semantic search?

In keyword-based search the goal is known, and the refinement process should reach the desired target as quickly as possible. Exploratory search, by contrast, assists the user in traversing a domain along multiple paths. The user can move back and forth on alternative search paths and can then access all the underlying and related data.
Search activities can be grouped into “lookup”, “learn” and “investigate” (Marchionini). Keyword-based search is sufficient for lookup (retrieval, question answering, etc.), but learning and investigating are exploratory search activities. If the user is not experienced in the domain of the search topic, relevant keywords are difficult to come up with.

Unlike keyword-based search, exploratory search requires active user involvement over several iterations. While the result of a keyword search is just a linear list, the output of an exploration can be multi-dimensional, for example clustered search results or related topics. As a result, new user interfaces are needed to display search results and data relationships and to facilitate user interaction during exploratory search.

Exploratory search includes methods to recommend alternative search paths and to suggest additional information related to the original search results. Semantic technologies are used to establish these cross-links to further information, which makes it possible to explore the repository.
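
Recommending alternative search paths can be sketched, under heavy simplification, as “find other entities that share a relation with the current result”. The Python below does just that; the relation table is a small illustrative sample, not MuseoTorino data, and the function name is hypothetical.

```python
from collections import defaultdict

# Illustrative relations between semantic entities.
relations = [
    ("Palazzo Carignano", "architect", "Guarino Guarini"),
    ("Palazzo Carignano", "period", "Baroque"),
    ("Cappella della Sindone", "architect", "Guarino Guarini"),
    ("Cappella della Sindone", "period", "Baroque"),
    ("Palazzo Madama", "period", "Baroque"),
]


def suggest_paths(result, relations):
    """Group other entities by the relation/value they share with `result`."""
    shared = [(rel, value) for subject, rel, value in relations if subject == result]
    suggestions = defaultdict(list)
    for subject, rel, value in relations:
        if subject != result and (rel, value) in shared:
            suggestions[(rel, value)].append(subject)
    return dict(suggestions)


for (rel, value), others in suggest_paths("Palazzo Carignano", relations).items():
    print(f"same {rel} ({value}):", ", ".join(others))
```

Each group is a possible next step for the user: other works by the same architect, or other buildings of the same period. The output is grouped rather than ranked, which is the multi-dimensional quality that a plain keyword result list lacks.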

The basis for exploration is LOD resources and their relationships. To broaden and refine search results and to open up subsequent search paths, search queries and search results should be aligned with semantic entities that are linked by content-based relationships. This makes it possible to extend the scope of a search by investigating the semantic context, the time references or the geographical references related to the search query or to the original search results.
The semantic exploitation of a repository, whether it is made up of textual or multimedia data, requires the content of its documents to be associated with corresponding semantic entities. This mapping process is referred to as named entity recognition. First, it involves identifying named entities in the metadata of a resource or in the resource itself, if it is represented as text. These named entities are extracted with the help of NLP techniques and mapped to semantic entities from the LOD (entity mapping). A named entity may be associated with several semantic entities with different meanings. These ambiguities are caused by polysemy in natural language and can be resolved by word sense disambiguation based on additional contextual information.
Unlike strictly RDF search engines such as Sindice or Sig.ma, it is now possible to search for documents and semantic entities at the same time. Semantic entities assigned to documents extend the capabilities of traditional keyword search, possibly covering relationships in multiple directions.
All this needs to be combined with geolocation and data quality, also from the temporal point of view, as well as with a graphical view that enables users to find what they are looking for.
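
The entity mapping and disambiguation step described above can be illustrated with a deliberately naive Python sketch. It uses a simple keyword-overlap heuristic instead of a real NLP pipeline, and the gazetteer, the entity identifiers and the keyword sets are all invented for the example.

```python
import re

# Toy gazetteer: surface form -> candidate entities, each with context keywords.
# Identifiers and keywords are invented for illustration.
CANDIDATES = {
    "cavour": [
        {"id": "person/camillo_benso",
         "keywords": {"count", "statesman", "minister", "born"}},
        {"id": "place/cavour_town",
         "keywords": {"town", "municipality", "piedmont", "village"}},
    ],
}


def tokens(text):
    """Lowercased set of alphabetic words in `text`."""
    return set(re.findall(r"[a-z]+", text.lower()))


def map_entities(text):
    """Map each known surface form to the candidate sharing most words with the context."""
    context = tokens(text)
    mapping = {}
    for surface, candidates in CANDIDATES.items():
        if surface not in context:
            continue
        best = max(candidates, key=lambda c: len(c["keywords"] & context))
        mapping[surface] = best["id"]
    return mapping


print(map_entities("Cavour was a statesman and prime minister of Italy"))
print(map_entities("Cavour is a small town in Piedmont"))
```

The same word maps to a person in one sentence and to a place in the other, purely because of the surrounding words. When no context word matches, this sketch simply falls back to the first candidate, which is exactly the kind of weakness a real disambiguation method has to address.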

I conclude by quoting from Europeana:
“Open Linked Data is:
• A technology to combine the many pieces of information we get from data providers.
• A way to share that data with other parties.
• A way to give users the best possible search experience.”

Here are some of the advantages of Open Linked Data for MuseoTorino:
1. Linked Open Data helps generate meaningful links between pages.
2. The museum becomes an authority for cultural information assets.
3. Data analyses and APIs built by users can bring in other users who would otherwise probably never be reached.
4. The customer experience improves thanks to better-quality information returned to the user.
5. The data can be reused directly in other domains such as education, tourism and science.

Get Your Data Out!

Wishing you the Next Web of Linked Open Data!

Thank you, Gian Luca! A heartfelt Happy New Year to you and your loved ones.
