Curr Biol: Research data are "disappearing" year by year

2013-12-27 MedSci (MedSci original)

The older the raw data, the harder they are to obtain. That is the unsurprising finding of a new study in which a group of ecologists and evolutionary biologists tried to track down the authors of 516 papers published between 2 and 22 years ago.


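Timothy Vines, an evolutionary biologist at the University of British Columbia in Vancouver, Canada, came up with the idea for the study after finishing a paper late last year on how journal archiving policies affect data availability. Vines began to consider a broader question: how quickly do data (or the people who generated them) disappear?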



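Vines suspects that some emails, especially those sent to addresses taken from older papers, never reached the authors. The analysis shows that for each additional year since a paper was published, the odds that its data are still extant fall by 17%. Of the 26 papers from 1991, Vines and colleagues could confirm that the data still existed for only two; for papers from 2011, the figure rose steadily to 40% of that year's total, and it would have been higher had more authors responded.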
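
To make the 17% figure concrete, the short Python sketch below compounds a yearly 17% drop in odds over several years and converts the result back to a probability. Only the 17% rate comes from the article. The constant year-on-year rate, the function names, and the starting odds of 1:1 are illustrative assumptions, not values reported by the study.

```python
def odds_after(years: int, start_odds: float, yearly_drop: float = 0.17) -> float:
    """Odds left after `years`, assuming the odds shrink by `yearly_drop` every year."""
    return start_odds * (1.0 - yearly_drop) ** years


def odds_to_probability(odds: float) -> float:
    """Convert odds, defined as p / (1 - p), back to the probability p."""
    return odds / (1.0 + odds)


# Hypothetical baseline: even odds (1:1, i.e. 50%) at year 0.
# This starting value is an assumption for illustration, not a figure from the study.
start = 1.0
for years in (0, 5, 10, 20):
    odds = odds_after(years, start)
    print(f"{years:2d} years: odds {odds:.3f}, probability {odds_to_probability(odds):.1%}")
```

Note that a 17% drop in odds is not the same as a 17% drop in probability: the two only move together closely when the probability is small.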


Vines TH, Albert AY, Andrew RL, Débarre F, Bock DG, Franklin MT, Gilbert KJ, Moore JS, Renaut S, Rennison DJ. The Availability of Research Data Declines Rapidly with Article Age. Curr Biol. 2013 Dec 18. pii: S0960-9822(13)01400-0. 

Comments (3)
  1. 2014-04-10 sunylz
     #Bio#
  2. 2014-08-12 12498ebem31暂无昵称
     #Biol#
  3. 2013-12-27 Hilda Bastian

    There was animated discussion (and a fair amount of cringing) when this paper was presented at the Peer Review Congress earlier this year (see this blog post). Needing to gather, adequately describe and store the data we analyze in a way that others can use it has major implications for the daily life of many researchers.
    Having a spotlight shone on the issue of adequacy of data stewardship is important, but there are some issues to keep in mind. It's in a very specific area of research. Some other fields have particular regulations about the retention, privacy and sharing of data. See for example recent analyses of the availability of clinical trial data (Riveros C, 2013).
    The number of papers in this study dwindles in earlier years: 26 in 1991 compared with 80 in 2011. Data within particular categories (such as definitely lost in any one year) are correspondingly small.
    It was interesting that only 2.4% of studies had made their data available at the time of publication. (Those studies were excluded).
    The authors practice what they preach: the full data are in Dryad and there's a manuscript in arXiv.



