Tokyo, Apr 3, 2013 - (JCN Newswire) - Fujitsu Laboratories Limited, the Digital Enterprise Research Institute (DERI) of the National University of Ireland Galway, and Fujitsu Laboratories of Europe Limited have jointly announced a revolutionary new data storage technology that stores and queries interconnected Linked Open Data (LOD)(1), available globally. The technology will be made available free of charge on a cloud-based platform that utilizes LOD, thereby promoting open data usage and enabling the easy utilization of enormous volumes of LOD.
The joint development program is focused on overcoming the challenges presented by the huge quantity of LOD available via the Internet, as well as the difficulties in using and processing LOD effectively. Fujitsu Laboratories, DERI and Fujitsu Laboratories of Europe have developed an LOD-utilizing platform that, using a standard Application Programming Interface (API), is capable of batch searches of billions of pieces of stored LOD via high-speed search algorithms that are five to ten times faster than before.
Much of the information available in LOD format comes from academic and government institutions, and individual data sets are available only from the respective organizations' websites, so it has been difficult in the past to determine what type of data exists and where it is located. The new technology overcomes this, incorporating a function to enable visual searches of data required by applications, using a search interface that visualizes data together with its linked information. As a result, application developers can instantly access the data they need, without having to search through individual websites and process the underlying data.
Fujitsu Laboratories Ltd. plans to promote the use of open data, and will be the first in the world to make the newly developed technology freely available to the public, on a cloud-based platform that utilizes LOD. (Limited availability planned from 2013)
The technology will be introduced at the international conference XBRL26, from April 16-18 in Dublin, Ireland in the form of a case study in open data usage. Full details of the technology, along with an enterprise analysis application that employs the technology, will be presented at the conference.
In the US, the public data website Data.gov was launched as part of an open government initiative, and as of March 2013, the national governments of 30 countries have released government data through websites intended specifically for that purpose. In Japan, the Cabinet's IT Strategic Headquarters promulgated an e-gov open data strategy in July 2012 and has been gradually rolling out a legal framework for open data and the disclosure of public data.
To ensure that public data can be easily processed by machines without relying on proprietary applications, the World Wide Web Consortium (W3C), the main international standards organization for the World Wide Web, recommends Linked Data, a format based on the Resource Description Framework (RDF) in which data links connect each piece of data to other pieces, making the data easier to discover and use. The UK's Data.gov.uk public data site also employs the Linked Data format, and as of March 2013, a total of more than 40 billion pieces of data worldwide have been published as Linked Data. The network of data formed by these links is known as Linked Open Data.
LOD is currently published through websites operated by individual data providers. These data are available through a variety of different access methods, and many of the sites provide their own search functions to find data on their sites. From the perspective of an application developer, however, these individually developed search functions present a number of challenges: 1) it is impossible to easily determine on which site the desired data is located; 2) complex application-side processing is required to combine and process multiple kinds of data; and 3) if a site does not provide its own search function, data cannot be searched. Until now, these problems have been left up to application developers to address, which has become a roadblock to making effective use of LOD.
Newly Developed Technology
Fujitsu Laboratories Ltd. has developed a data store technology that collects and stores LOD published throughout the world and can perform batch searches on multiple kinds of data. This technology has a search interface for users, and features a standard API (SPARQL(2)).
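As an illustration of what access through a standard SPARQL API could look like, the following sketch builds a SPARQL query and encodes it as an HTTP GET request, carrying the query text in a `query` URL parameter as the W3C SPARQL Protocol describes. The endpoint URL, the `ex:` vocabulary and the function name are assumptions made for this example; the release does not give the platform's actual address or vocabulary, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the release does not state the platform's real URL.
ENDPOINT = "http://lod-platform.example.org/sparql"

# Ask for companies with their label and employee count.
# The ex: predicate is invented for illustration only.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/vocab#>
SELECT ?company ?label ?employees
WHERE {
  ?company rdfs:label ?label .
  ?company ex:numberOfEmployees ?employees .
}
LIMIT 10
"""


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Because the query language and the request format are standardized, the same client code could in principle be pointed at any SPARQL endpoint by changing only the endpoint address.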
1. Distributed search technology adapted for an LOD link structure
When collecting data in a single place, the massive data structure created by the links between pieces of data requires special handling. Not only is it difficult simply to handle the larger data volumes, but developing a technology for rapidly searching complex data link structures has also proved to be a challenge. In particular, when searching for common elements that are linked together within data, it is necessary to perform comprehensive matching (cross-referencing) of massive data sets, which can lead to performance deterioration.
To facilitate search processing that requires this kind of cross-referencing, Fujitsu Laboratories Ltd. has developed a caching structure that is specifically adapted to LOD. This is employed in combination with distributed processing, thereby enabling a five- to tenfold speed improvement. More specifically, each distributed server can perform cross-referencing processing by adjusting search conditions to reduce the workload in the master server and hence shorten overall processing time.
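The idea of letting each distributed server do its own cross-referencing can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm developed by Fujitsu Laboratories, which the release does not describe in detail: triples are assumed to be partitioned by subject so that all facts about one resource sit on the same server, and the master only passes the search conditions down and merges the small result lists. All data, names and functions are hypothetical.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object); all values are illustrative.
TRIPLES = [
    ("co:A", "industry", "Electronics"),
    ("co:A", "employees", 170000),
    ("co:B", "industry", "Electronics"),
    ("co:B", "employees", 900),
    ("co:C", "industry", "Retail"),
    ("co:C", "employees", 50000),
]


def partition_by_subject(triples, n_workers):
    """Place all triples about one subject on the same worker."""
    parts = [[] for _ in range(n_workers)]
    for s, p, o in triples:
        # Deterministic toy hash: sum of character codes modulo worker count.
        parts[sum(map(ord, s)) % n_workers].append((s, p, o))
    return parts


def worker_join(local_triples, industry, min_employees):
    """Cross-reference both conditions locally and return only the matches."""
    by_subject = defaultdict(dict)
    for s, p, o in local_triples:
        by_subject[s][p] = o
    return [
        (s, props["employees"])
        for s, props in by_subject.items()
        if props.get("industry") == industry
        and "employees" in props
        and props["employees"] >= min_employees
    ]


def master_query(parts, industry, min_employees):
    """Push the conditions down to every worker, then merge the results."""
    results = []
    for local in parts:  # a real system would run these calls in parallel
        results.extend(worker_join(local, industry, min_employees))
    return sorted(results)


parts = partition_by_subject(TRIPLES, n_workers=3)
print(master_query(parts, "Electronics", 1000))
```

In this arrangement the master never sees the raw triples, only the rows that already satisfy both conditions, which is one way the master's workload could be kept small.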
Based on the fact that links in LOD link structures are typically concentrated in only a portion of the nodes, and by taking advantage of past usage frequency, Fujitsu Laboratories Ltd. has also developed an algorithm that efficiently caches only the data that is heavily accessed in cross-referencing. The new algorithm has made it possible to reduce disk accesses and thereby accelerate searching.
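A frequency-aware cache of this kind can be approximated with a small sketch. The release does not publish the actual algorithm, so the code below substitutes a simple least-frequently-used (LFU) style policy as a stand-in: a node is admitted to a full cache only if it has been requested more often than the coldest cached node, so heavily linked "hub" nodes tend to stay in memory. The class name, the `backing_store` dictionary (standing in for disk) and the sample data are all assumptions for illustration.

```python
from collections import Counter


class FrequencyCache:
    """Keep only the most frequently requested nodes in memory (LFU-style)."""

    def __init__(self, backing_store, capacity):
        self.store = backing_store  # stands in for slow disk storage
        self.capacity = capacity
        self.hits = Counter()       # past usage frequency per node
        self.cache = {}
        self.disk_reads = 0

    def get(self, node):
        self.hits[node] += 1
        if node in self.cache:
            return self.cache[node]
        self.disk_reads += 1        # cache miss: simulate a disk access
        value = self.store[node]
        self._maybe_admit(node, value)
        return value

    def _maybe_admit(self, node, value):
        if len(self.cache) < self.capacity:
            self.cache[node] = value
            return
        # Replace the coldest cached node only if the new one is hotter.
        coldest = min(self.cache, key=lambda n: self.hits[n])
        if self.hits[node] > self.hits[coldest]:
            del self.cache[coldest]
            self.cache[node] = value


# A hub node that many queries touch, plus rarely used nodes.
store = {"hub": ["r1", "r2", "r3"], "r1": ["hub"], "r2": ["hub"], "r3": ["hub"]}
cache = FrequencyCache(store, capacity=1)
requests = ["hub", "r1", "hub", "r2", "hub", "r3", "hub"]
for node in requests:
    cache.get(node)
print(cache.disk_reads, "disk reads for", len(requests), "requests")
```

With room for only one entry, the hub node stays cached while the rarely requested nodes are read from the backing store each time and never displace it.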
2. Search interface enables bird's-eye view of LOD
The search interface is designed to give application developers a better grasp of what kinds of data are located where. In addition to enabling searches across all data, the interface also allows searches based on statistical data that expresses the usage frequency and pervasiveness of data, as well as searches using the licenses that are assigned to data. This, in turn, makes it far easier to home in on desired data. Search results are displayed in visual form, together with the links that connect data (Figure 3), enabling application developers to visually ascertain the information they need.
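To make the idea of searching by statistics and license concrete, the following sketch filters a small catalog of data sets by permitted license and by how often other data sets link to them. The catalog entries, field names and the `find_datasets` function are hypothetical; the release does not specify which statistics or license metadata the actual interface exposes.

```python
from dataclasses import dataclass


@dataclass
class DatasetInfo:
    name: str
    license: str
    triple_count: int
    inbound_links: int  # how often other data sets link to this one


# Invented catalog entries, for illustration only.
CATALOG = [
    DatasetInfo("company-registry", "CC-BY", 2_000_000, 150),
    DatasetInfo("stock-prices", "proprietary", 50_000_000, 20),
    DatasetInfo("encyclopedia", "CC-BY-SA", 400_000_000, 9000),
]


def find_datasets(catalog, allowed_licenses, min_inbound_links=0):
    """Return data sets with a permitted license, most linked-to first."""
    matches = [
        d for d in catalog
        if d.license in allowed_licenses
        and d.inbound_links >= min_inbound_links
    ]
    return sorted(matches, key=lambda d: d.inbound_links, reverse=True)


for d in find_datasets(CATALOG, {"CC-BY", "CC-BY-SA"}, min_inbound_links=100):
    print(d.name, d.license, d.inbound_links)
```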
By employing the newly developed technology, it is possible for application developers to obtain the information they need in a single location, without having to search across multiple public data websites. By using a standard API, developers can easily develop applications that freely combine a variety of data published through LOD.
As a case study, Fujitsu Laboratories Ltd. has developed an enterprise analysis application that employs the new technology. The application combines several data sets published through LOD, such as basic company information (industry category, number of employees, etc.), public financial information (revenues, profit, etc.), and stock prices, thereby making it possible to instantly analyze a company's performance from multiple angles.
From 2013, Fujitsu plans to incorporate the newly developed technology, free of charge, on a cloud-based platform that utilizes LOD, as part of its efforts to promote open data usage. In addition, Fujitsu will move to apply this technology to its business divisions involved in data utilization, leveraging it across an array of fields.
Please see www.fujitsu.com/global/news/pr/archives/month/2013/20130403-02.html for the full release with diagrams.
(1) Linked Open Data: A dataset published in the Linked Data format. As of March 2013, some 340 public-data sites used it to publish a total of 40 billion pieces of data. A typical example is DBpedia, which converts information from Wikipedia into the Linked Data format.
(2) SPARQL: A query language for RDF established by W3C. It is also used for searches of Linked Data.
About Fujitsu Laboratories
Founded in 1968 as a wholly owned subsidiary of Fujitsu Limited, Fujitsu Laboratories Limited is one of the premier research centers in the world. With a global network of laboratories in Japan, China, the United States and Europe, the organization conducts a wide range of basic and applied research in the areas of Next-generation Services, Computer Servers, Networks, Electronic Devices and Advanced Materials. For more information, please see: http://jp.fujitsu.com/labs/en.
About DERI
With over 140 researchers, DERI is one of the world's leading international web science research institutes, and has a specific focus on the Semantic Web and Networked Knowledge. DERI is a Centre for Science, Engineering and Technology (CSET) established in 2003 with funding from Science Foundation Ireland (SFI). As a CSET, DERI brings together academic and industrial partners to boost innovation in science and technology, with its research focused on the Semantic Web. DERI has leveraged its SFI CSET funding to add significant additional research funding from the European Union, Enterprise Ireland, and industry sources. For more information, please see: http://www.deri.ie.
About Fujitsu Laboratories of Europe Limited
Fujitsu Laboratories Limited has had an active presence in Europe since 1990, forming Fujitsu Laboratories of Europe Limited in 2001. The company's groundbreaking work is closely aligned to the future needs of the business community, focused on making future technologies a reality for today's businesses. Fujitsu Laboratories of Europe aims to shorten the R&D cycle to put cutting edge technologies into customers' hands as quickly as possible, enabling businesses to gain a tangible competitive advantage. Close collaboration with leading academics and experts Europe-wide forms a central element of Fujitsu Laboratories of Europe's approach, ensuring the effective pooling of expertise with other pioneers in any given field of research. Fujitsu Laboratories of Europe also participates in a number of EU research initiatives, bringing together the joint expertise of industry and academia to accelerate the development and use of new technologies on a pan-European basis. For more information, please see: www.fujitsu.com/emea/about/fle/.
About Fujitsu Limited
Fujitsu is the leading Japanese information and communication technology (ICT) company offering a full range of technology products, solutions and services. Over 170,000 Fujitsu people support customers in more than 100 countries. We use our experience and the power of ICT to shape the future of society with our customers. Fujitsu Limited (TSE:6702) reported consolidated revenues of 4.5 trillion yen (US$54 billion) for the fiscal year ended March 31, 2012. For more information, please see www.fujitsu.com.
Source: Fujitsu Limited
Fujitsu Limited
Public and Investor Relations
www.fujitsu.com/global/news/contacts/
+81-3-3215-5259
Technical Contacts
Fujitsu Laboratories Ltd.
Software Systems Laboratories
Intelligent Technology Lab.
E-mail: email@example.com
Copyright 2013 JCN Newswire. All rights reserved. www.japancorp.net