Update(MM/DD/YYYY):05/20/2005
Development of Private Data Protection and Management System through Grid Technology-Based Open Source Software Originated from Japan
The National Institute of Advanced Industrial Science and Technology (AIST; President: Prof. Hiroyuki Yoshikawa), an independent administrative institution, has developed a Private Data Protection and Management System (PDPMS) based on the outcome of the joint project "Enhancing Information Security through Grid Technology", carried out since April 2004 with NTT Neomate Corporation (President: Ken-ichi Nishimura).
The PDPMS is characterized by these features: <1> private data are pulverized, or atomized, at random so that individuals cannot be identified, <2> the fragments are thoroughly mixed and combined into meaningless blocks of characters, and <3> the blocks are stored at multiple data centers in a randomly distributed manner by use of grid technology. The metadata showing where the data are located are encrypted and stored at multiple separate data centers. It is therefore very difficult for a malevolent party or a third person to disclose personal data. In addition, distributed storage of duplicated data at different places by use of grid technology reduces the risk of losing data in an accident or natural disaster.
The outcome of the present study will be used in the "AQStage PF IP Call Center" service provided by NTT Neomate, as well as in application services that handle various personal data. The two parties will continue their collaboration to achieve the synergistic effects of industry-government cooperation and to contribute to the dissemination and expansion of grid technology in the business sector. Business model and technology patent applications are being filed for the findings of the R&D work.
1. Background and Objective of Collaborative Development
(1) NTT Neomate
Now that the leakage of private data has become a social issue and the full-scale enforcement of the Act for Protection of Computer Processed Personal Data held by Administrative Organs was set for April 2005, the security of personal data entrusted directly by clients, or by specific enterprises that commission call center business, has become an important task for enterprises dealing with massive amounts of personal data. Under these circumstances, NTT Neomate has been working to further upgrade the security of personal data stored at a data center as part of its activities in the AQStage PF IP Call Center Service.
(2) AIST
The Grid Technology Research Center (GTRC) of AIST has been engaged in research and development of "Gfarm™", system software for processing massive data, as large as petabytes (peta = 10^15), derived from genetic and astronomical data analysis, in collaboration with multiple centers distributed over the world, and has been leading the international standardization of grid technology. In addition, the GTRC is making efforts to promote joint development with the industrial sector in preparation for R&D of grid technology.
The work may be regarded as a worthwhile attempt to demonstrate the usefulness of Gfarm, a distributed file system for massive data developed by the GTRC-AIST, in the general business world. That Gfarm is applicable to the private data protection and management system and to a wide range of business areas is of great significance, because it demonstrates that advanced software originating from Japan can be helpful not only in academic fields but also in various business activities and in the construction of electronic government.
The collaboration took place because the objectives of the two organizations coincided.
2. Overview and Features of Private Data Protection and Management System (Outcome of R&D Works)
The overview and features of the newly developed system are described below:
Fig. 1 Overview of private data protection and management system
(1) Private data are pulverized to the level of individual characters so that individuals cannot be identified.
(2) The isolated characters are mixed and combined into meaningless blocks of characters.
(3) Private data, pulverized and admixed, are stored in a dispersed manner in file servers at different data centers by use of grid filesystem software Gfarm.
(4) For retrieval, the storage site data of the pulverized, mixed and dispersed private data are encrypted and cached in a separate metaserver. Because the pulverized data are kept apart from the storage site data, rather than being stored together as tagged data, a malevolent party or a third person who takes data out of a file server in a data center cannot restore them.
(5) Data are duplicated and mixed with data from another file server before being stored in a dispersed manner in three or more file servers. In this way, if a natural disaster damages one of the data centers, the data can be restored from the file servers in an undamaged data center.
(6) In contrast to the earlier approach of storing private data in a dispersed manner after fragmenting them at the level of files, the newly developed system pulverizes data to the level of words, stores them in a dispersed manner by use of grid technology, and encrypts the storage site data and stores them on a different server. Business model and technology patent applications are being filed for the new technology.
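Steps (1) to (5) can be illustrated with a minimal Python sketch. It is not the PDPMS implementation and does not use Gfarm: the function names `pulverize_and_disperse` and `restore`, the in-memory "servers", the choice of two copies per character, and the sample record are all assumptions made for illustration. The encryption of the storage site data described in (4) is omitted; here the placement table is simply returned so that it can be kept apart from the stored characters.

```python
import random


def pulverize_and_disperse(record, servers, copies=2, rng=None):
    """Split `record` into characters and scatter them over `servers`.

    Returns (stores, placement):
      stores    - server name -> list of characters held by that server
      placement - for each character position, a list of (server, slot)
                  pairs, i.e. the "storage site data" (not encrypted here)
    """
    rng = rng or random.Random()
    stores = {name: [] for name in servers}
    placement = [[] for _ in record]

    # Assign every character to `copies` distinct servers.
    jobs = []
    for pos in range(len(record)):
        for target in rng.sample(servers, copies):
            jobs.append((pos, target))

    # Shuffle so the order inside each server is unrelated to the record.
    rng.shuffle(jobs)
    for pos, target in jobs:
        stores[target].append(record[pos])
        placement[pos].append((target, len(stores[target]) - 1))
    return stores, placement


def restore(stores, placement, unavailable=()):
    """Rebuild the record, skipping servers listed in `unavailable`."""
    chars = []
    for replicas in placement:
        for server, slot in replicas:
            if server not in unavailable:
                chars.append(stores[server][slot])
                break
        else:
            raise LookupError("no surviving copy of a character")
    return "".join(chars)


# Hypothetical usage with a made-up record and three data centers.
servers = ["A", "B", "C"]
record = "Taro Yamada, sample entry"
stores, placement = pulverize_and_disperse(record, servers)
print(restore(stores, placement) == record)                     # True
print(restore(stores, placement, unavailable={"B"}) == record)  # True
```

Because the two copies of each character always go to different servers, the record can still be rebuilt when any single server is lost, while the contents of any one server are a shuffled run of characters that cannot be reordered without the placement table.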
3. Value of "Private Data Protection and Management System" for IP Call Center
The Private Data Protection and Management System is applied to the IP Call Center in the following way (Fig. 2).
Fig. 2 IP Call center system secured by grid technology
(1) When the call center starts business, a list of private data is created from the file servers in data centers A, B and C and temporarily stored as a cache in the customer relationship management database (CRM-DB) server in data center X. When the call center closes for business or the CRM-DB server is powered off, the cache is immediately deleted.
(2) Because private data are stored in the individual file servers and data servers in the form of meaningless information, a malevolent party or third party who takes data out during a server system upgrade or the replacement of storage parts for repair will fail to retrieve any personal data.
(3) Taking data out is very difficult, not only because the operator terminals in the call center use devices that prevent irrelevant data from being collected, downloaded or carried out, but also because data transmission and reception are strictly managed.
In this way, it becomes possible to implement an IP call center with intensified security, making it extremely difficult for a malevolent party concerned with the call center operator system or a third person to leak private data.
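The cache behavior described in (1) can be pictured as a resource that exists only while the call center is operating. The following Python sketch is an illustration under assumptions, not the AQStage implementation: `session_cache` and the `load_records` callable are hypothetical names, and a plain dictionary stands in for the CRM-DB server.

```python
from contextlib import contextmanager


@contextmanager
def session_cache(load_records):
    """Hold restored records only for the duration of one business session.

    `load_records` is a hypothetical callable that rebuilds the customer
    list from the dispersed file servers and returns it as a mapping.
    """
    cache = dict(load_records())
    try:
        yield cache
    finally:
        # Delete the cached private data when the session ends,
        # including when it ends because of an error.
        cache.clear()


# Hypothetical usage with a made-up customer entry.
with session_cache(lambda: {"c001": "sample customer"}) as cache:
    print(cache["c001"])  # sample customer
```

Clearing the cache in the `finally` clause mirrors the statement that the cache is deleted both at the close of business and when the server stops unexpectedly.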
Fig. 3 AQStage PF IP Call Center Service