Abstract
In this paper, we address the issues associated with developing and sustaining a network of data sites and databases hosted by academic-based groups. These are the groups that, over the years, have conducted the majority of research on geothermal systems while educating the next generation of geothermal professionals and researchers. Such a network can be envisioned as an Internet-connected series of nodes (data sites) that allows for a common approach to finding data among the linked sites. Each of the co-authors' groups has for years collected and provided data to researchers, industry, state and federal agencies, and the public. Over the last ten years, better management of and access to data have become increasingly important not only to the research process, but also to industry as it pushes forward with delineating and producing geothermal resources, to state and federal agencies as they work to meet their missions and mandates, and to efforts to inform the public about the importance of geothermal energy. Indeed, both Congress and the White House continue to strengthen the bipartisan goal of free and open access to all data created with federal funding. For academic-based data sites, the challenges are many but not insurmountable. For example, we need to provide seamless linkages between data and analysis and visualization tools, in particular high-level modeling programs and the computational resources they require. When dealing with research results and industry partnerships, data under moratorium and proprietary data must be handled carefully and securely. At the same time, we have to make it as easy as possible for users to discover, aggregate, and synthesize data so that they can focus on analyzing the data rather than on finding and compiling them. We must also make better use of these research-level data within the education enterprise to train and attract the next generation of geoscientists and geoengineers.
Data discovery and sharing among multiple data sites is a persistent issue. Each site in the network hosts some unique data, but a substantial number of data types may overlap among some of the sites, which complicates discovery and sharing. The notion of the "semantic web" as the solution for data integration has sparked continued debate because of its underlying requirement for ontologies and single-definition vocabularies. A hybrid approach is now evolving that uses some tools of the semantic web but recognizes the investment in current data sites and the different needs of the various user communities. This approach also recognizes that much "knowledge" (the ultimate value of a data network) is wrapped up in differences among vocabularies, languages, and concepts, and that forcing singularity decreases the knowledge value of data and their resulting data products. The hybrid approach makes data sharing more difficult, but not impossible, and importantly opens up the system to broader participation and collaboration. Sustainability will always remain an issue for an academic-based data network. Self-funding from the home institution is not feasible. However, as agencies continue to increase their efforts to manage and provide access to data generated by the projects and activities they fund, a long-term agency-academic partnership will evolve that includes both funding academic-based data networks and relying on these networks to provide some of that public access to federally funded data. Finally, it is important to note that these academic-based data networks have to be self-governing for them to work at all, much less be sustainable.