Submitted: 2015-01-11 Revised: 2015-03-11
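Abstract (Chinese version, translated): In heterogeneous environments, current data provenance research mainly relies on the OPM model to represent how data originates in ETL, and it suffers from inconsistent provenance concepts, confused vocabulary usage, and the lack of standardized access. Based on the W3C PROV model, a unified representation mechanism for ETL provenance information is proposed. The mechanism first gives a unified description of the provenance concepts of the ETL process and their relationships. Then, to meet the particular semantic requirements of the ETL process, a multi-granularity ETL provenance vocabulary is built. Finally, a standardized query mechanism built on RDF improves the accessibility of provenance information.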
Abstract: In heterogeneous environments, data provenance information in ETL is currently represented on the basis of OPM. However, there is still no consensus on the conceptual representation of ETL provenance information, on the usage of provenance vocabulary, or on a consolidated access mode. A unified provenance representation mechanism for ETL, based on PROV, is proposed. First, it presents a concept representation mechanism for ETL that describes the primary provenance concepts and their relationships. Second, it constructs a multi-granularity vocabulary to express provenance information at different levels of abstraction. Finally, a standard access mode is proposed in which provenance information is organized into two levels: the bottom level is described in RDF, and the upper level is formed by querying the bottom level.
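The two-level organization mentioned in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the ETL step and dataset names (`orders_raw`, `cleanse`, and so on) and the `http://example.org/etl#` namespace are hypothetical, and a plain Python set of triples stands in for an RDF store. Only the PROV terms (`prov:Entity`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`) come from the W3C PROV-O vocabulary.

```python
# Minimal sketch: PROV-style provenance of a hypothetical ETL run, stored as
# plain (subject, predicate, object) triples. No RDF library is used.

PROV = "http://www.w3.org/ns/prov#"   # W3C PROV-O namespace
EX = "http://example.org/etl#"        # hypothetical namespace for this sketch
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Bottom level: raw provenance statements about entities and activities.
triples = {
    (EX + "orders_raw", RDF_TYPE, PROV + "Entity"),
    (EX + "orders_clean", RDF_TYPE, PROV + "Entity"),
    (EX + "orders_fact", RDF_TYPE, PROV + "Entity"),
    (EX + "cleanse", RDF_TYPE, PROV + "Activity"),
    (EX + "load", RDF_TYPE, PROV + "Activity"),
    (EX + "cleanse", PROV + "used", EX + "orders_raw"),
    (EX + "orders_clean", PROV + "wasGeneratedBy", EX + "cleanse"),
    (EX + "load", PROV + "used", EX + "orders_clean"),
    (EX + "orders_fact", PROV + "wasGeneratedBy", EX + "load"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def direct_sources(entity):
    """Entities used by any activity that generated `entity`."""
    sources = set()
    for _, _, activity in match(s=entity, p=PROV + "wasGeneratedBy"):
        for _, _, used in match(s=activity, p=PROV + "used"):
            sources.add(used)
    return sources


def lineage(entity):
    """Upper level: all upstream entities, built by repeated bottom-level queries."""
    seen = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        for src in direct_sources(current):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen
```

Calling `lineage(EX + "orders_fact")` walks back through the `load` and `cleanse` activities and returns both upstream datasets. In a real deployment the same question would more likely be posed as a SPARQL query against an RDF store.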
Keywords: ETL; data provenance; interoperability; PROV; OPM
Article ID: 201500037 CLC number: Document code:
Funding: Supported by the General Program of the National Natural Science Foundation of China (61170306)
Author biography:
Citation:
Ke Jie, Dong Hongbin, Liang Yiwen, Tan Chengyu, Ai Yong. A PROV Based Representation of Data Provenance for ETL Process[J]. Advanced Engineering Sciences, 2015, 47(5): 123-129.