Data engineering

As critical components of modern information systems, databases and the various data structures that store and convey corporate information require robust and flexible design, maintenance and evolution methodologies. The aim of the Data engineering group is to develop models, techniques, methods and tools that support the whole data lifecycle, ensuring high quality from the outset and maintaining, evaluating and improving that quality over time. Thanks to a unifying framework, this wide-scope research addresses the most popular data models, including database and file models (legacy, current and future), paper and electronic documents and forms, ontologies, XML documents, workflow models and the web. In addition to standard engineering processes such as database analysis, design and coding, the group also addresses processes such as database reverse engineering, reengineering, migration and evolution, as well as semantic indexing of heterogeneous data sources.

 

Themes

  • Database Design. Techniques, methods and tools have been developed for such core processes as user-driven requirements acquisition and validation, schema integration, schema quality evaluation and improvement, and correct and complete translation of conceptual schemas into DBMS DDL (a minimal illustration of this translation is sketched after this list).
  • Database Reverse Engineering. The reconstruction of precise database documentation is a prerequisite for the later maintenance and evolution of databases and application programs. A comprehensive methodology encompassing most data models is available and is supported by the DB-MAIN CASE tool.
  • Database Evolution. As corporate requirements evolve, databases and programs must be transformed accordingly. Systematic techniques and methodologies have been designed and validated to help practitioners apply appropriate changes to database schemas, database contents and the application programs that use these databases.
  • Metadata modelling and management. Many modern applications require knowledge of their environment (e.g., information, processes, organization, resources). A generic model and a metadata repository have been designed to store this evolving information. Applications have been developed in the e-Health field.
  • Special-purpose Databases. Most applications rely on complex static and dynamic rules and on time-dependent data. Models, methodologies and tools have been built to master the development of active and temporal databases. Methods and tools have also been developed to recover heterogeneous corporate numeric data and to transform them into statistical data warehouses that address dimension evolution issues.
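
The Database Design item above mentions translating conceptual schemas into DBMS DDL; the following minimal Python sketch illustrates the idea on a toy entity type. The EntityType and Attribute structures and the generate_ddl() function are assumptions made for illustration only; they are not the DB-MAIN model or API.

```python
# A minimal, hypothetical sketch of translating a conceptual entity type
# into DBMS DDL. The data structures and generate_ddl() are illustrative
# assumptions, not the DB-MAIN model or API.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Attribute:
    name: str
    sql_type: str = "VARCHAR(64)"
    mandatory: bool = True

@dataclass
class EntityType:
    name: str
    attributes: List[Attribute] = field(default_factory=list)
    identifier: List[str] = field(default_factory=list)  # primary identifier

def generate_ddl(entity: EntityType) -> str:
    """Produce a CREATE TABLE statement for one entity type."""
    columns = [
        f"  {a.name} {a.sql_type} {'NOT NULL' if a.mandatory else 'NULL'}"
        for a in entity.attributes
    ]
    if entity.identifier:
        columns.append(f"  PRIMARY KEY ({', '.join(entity.identifier)})")
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(columns) + "\n);"

customer = EntityType(
    name="CUSTOMER",
    attributes=[Attribute("cust_id", "INTEGER"),
                Attribute("name"),
                Attribute("phone", mandatory=False)],
    identifier=["cust_id"],
)
print(generate_ddl(customer))
```

The published design methodologies also handle relationship types, constraints and non-relational targets; this sketch covers only the simplest case.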

     

Scientific results

The main scientific achievements rely on (1) a wide-spectrum generic data model through which most past, current and future information representation models can be expressed and (2) a transformational framework with which most data engineering processes can be specified with precision. On this basis, rigorous methodologies have been developed and published for database design, reverse engineering, evolution, migration and integration, covering data models that range from legacy systems to XML and ontologies.
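
To make the transformational idea concrete, the toy Python sketch below holds a schema in a generic structure and applies one classic, semantics-preserving transformation: replacing a many-to-many relationship type by an entity type plus two functional relationship types. The dictionary representation and the function shown are illustrative assumptions, not the actual wide-spectrum model.

```python
# A toy illustration of the transformational framework idea. The dict-based
# schema representation and rel_type_to_entity_type() are assumptions for
# illustration; they do not reproduce the actual wide-spectrum model.

def rel_type_to_entity_type(schema: dict, rel_name: str) -> dict:
    """Replace a many-to-many relationship type by an entity type plus two
    functional (one-to-many) relationship types -- a classic
    semantics-preserving schema transformation."""
    rel = schema["rel_types"].pop(rel_name)
    # The new entity type inherits the relationship type's attributes.
    schema["entity_types"][rel_name] = {"attributes": rel.get("attributes", [])}
    # One functional relationship type per former role.
    for role, target in rel["roles"].items():
        schema["rel_types"][f"{rel_name}_{role}"] = {
            "roles": {"of": rel_name, "to": target},
            "cardinality": "N-1",
        }
    return schema

schema = {
    "entity_types": {"STUDENT": {"attributes": ["name"]},
                     "COURSE": {"attributes": ["title"]}},
    "rel_types": {"ENROLS": {"roles": {"student": "STUDENT", "course": "COURSE"},
                             "attributes": ["year"]}},
}
print(rel_type_to_entity_type(schema, "ENROLS"))
```

Chaining such elementary transformations is the general pattern by which design, reverse engineering and migration steps can be specified with precision.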

 

Industrial results

The research has led to various industrial applications, such as comprehensive methodologies for most data engineering processes and the DB-MAIN CASE tool, a programmable environment that supports all of the above data models and processes. Case studies and tutorials are available. These results are exploited and maintained by the ReveR spin-off.

 

Resources

  • Publications
books [detail], theses [detail], papers [detail], technical reports [detail]
  • Educational material
courseware [detail], tutorials [detail], textbooks [detail], etc.

  • Software
CASE tools [detail], file simulators [detail], SQL script interpreter for MS Access [detail]
  • Methodologies
database design [detail, detail], reverse engineering [detail], active databases [detail], temporal databases [detail]

Products and Services

  • Methodologies for database design and evaluation
  • DB-MAIN, a programmable data centered CASE platform
  • A comprehensive textbook on database design
  • Training modules in database design

Contributing projects

  • DB-MAIN (Database Engineering)
  • TimeStat (Web-based data warehouse from poor quality data sources)
  • e-Health (Data interoperability through an e-Health platform)
  • Rainbow (Deriving user-requirements from human-computer interfaces)
  • DB-Quality (Transformation-based quality evaluation of databases)
  • Quetelet.net - Instap (Web-based data warehouse for historical criminal statistics)
  • Gisele (model base for clinical pathways)
  • RISTART (Evolution of large information systems)

 [detail]

 

Former projects

  • REQUEST (Semi-automated generation of databases through business objects)
  • BioMaze (Biochemical database evolution)
  • Active Database (Active Database Engineering)
  • Data Migration (Techniques and tools for data transformation)
  • RetroWeb (Techniques and tools for web document reengineering)
  • DB-MAIN/Objectif 1; Certiform (Industrial training seminars and materials in DB Engineering)
  • InterDB (Architecture, Methods and Tools for Database Federation)
  • TimeStamp (Temporal Database Engineering)
  • DB-Process (Database Method Engineering)
  • PHENIX (Database Reverse Engineering)
  • ORGA; TRAMIS (Computer-Aided Database Design)
  • IDML (Technology-independent access to databases)
  • Large Administrative Databases (Models, Languages and Management Systems for Large Administrative Databases)

 [detail]

 

Senior members


Researchers