300 Area Integrated Field Research Challenge: overview and available data
Background
The 300 Area Integrated Field Research Challenge (IFRC) is one of three IFRCs in the Integrated Field-Scale Subsurface Research Challenge program, which is sponsored by the DOE Office of Science, Climate and Environmental Science Division. The two other IFRCs are the Old Rifle IFRC and the Oak Ridge IFRC. In the IFRC program (started in 2008), multi-investigator teams perform large, benchmark-type experiments addressing formidable field-scale science issues. The 300 Area IFRC focuses on the instrumentation, characterization, and investigation of a vadose zone and saturated zone field site at the Hanford 300 Area. This site was instrumented with a dense well array in 2008, and a number of field and laboratory experiments have been conducted (and are still ongoing) to resolve the geochemical, hydrophysical, and microbiological factors that control the migration of contaminant uranium through the vadose zone (water-unsaturated sediments below the soil and above groundwater) and groundwater.
These field and laboratory experiments have resulted in many multidisciplinary datasets as well as multiple publications and articles. During the design of the IFRC, DOE recognized the value of careful curation of these datasets, both to support the IFRC effort and to create a multidisciplinary dataset that future scientists could use in their own investigations.
How does data get on the site?
Data for the 300 Area IFRC comes from multiple scientists and laboratories, takes many different forms, and is associated with many different disciplines. One consequence is that integrating this data completely in a relational database is a substantial challenge and, because such integration often requires the design and implementation of new datamodels, will always lag behind actual data collection. In addition, some data (such as data from laboratory experiments) is hard to capture in a database. Thus, a two step process has been implemented.
Step one: datapackages
In the first step, scientists create, maintain and upload datapackages (through a webinterface on the website). A datapackage is a single file (which can be a zip file containing multiple files) which contains data and metadata. The scientist can set the access level of the datapackage: the metadata is always accessible to all project participants, but access to the actual datapackage may be restricted to subsets of users. The content and form of a datapackage are left up to the scientist providing the data, but most often these are either MS Office (Word/Excel) or pdf files. The datapackage and its associated editing capabilities have several advantages.
- First, it provides a quick and simple way for scientists to share their data with a minimum of effort: they can essentially zip up or compress a project folder and upload it.
- Second, with the datapackages scientists provide some basic metadata (such as datatype and a brief description) which allows other project participants to locate datasets.
- Third, it provides one single point of access to datasets which are likely to change, and thus removes the confusion which arises when scientists start emailing data back and forth.
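As an illustration of the datapackage idea described above, the following Python sketch bundles a few files and a small metadata record into a single zip archive. It is only a sketch under stated assumptions: the `metadata.json` file name, the metadata fields (`owner`, `datatype`, `description`, `access`) and the file contents are hypothetical and do not reflect the format actually used by the site's upload interface.

```python
import io
import json
import zipfile


def build_datapackage(files, metadata):
    """Bundle data files and a metadata record into one in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical convention: metadata is stored as metadata.json in the archive root.
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Placeholder content only; none of these names or values come from the IFRC site.
package = build_datapackage(
    files={
        "project_folder/readme.txt": "example notes",
        "project_folder/results.csv": "well,value\n",
    },
    metadata={
        "owner": "example_user",
        "datatype": "sample data",
        "description": "Illustrative package",
        "access": "project",
    },
)

# Read the archive back to confirm that data and metadata travel together.
with zipfile.ZipFile(io.BytesIO(package)) as archive:
    names = archive.namelist()
    stored = json.loads(archive.read("metadata.json"))
```

Keeping the metadata inside the same file as the data means a single upload carries everything another participant needs to locate and interpret the dataset.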
All data on the site is available in a datapackage. However, using this data effectively can require substantial effort. For instance, different kinds of analysis performed on the same sample may exist in different files, sample results do not contain sample locations, and datasets may be challenging to visualize due to size or complexity. This drives the second step: full integration in the database. This requires that
- an appropriate database model is selected. Where no such model exists, a database model is designed, implemented and instantiated;
- data is imported into the appropriate database tables;
- a webinterface is created or modified to allow access to the metadata and data. Both text based and map based (using the Google Maps API) interfaces are used.
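To make the benefit of integration concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The three-table schema (`well`, `sample`, `result`) and all values are hypothetical placeholders, not the actual IFRC database model. The sketch only shows how a relational model lets results from different analyses of the same sample be retrieved together with the sample location.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: wells carry locations, samples belong to wells,
# and each analysis result belongs to a sample.
conn.executescript("""
CREATE TABLE well (
    well_id  INTEGER PRIMARY KEY,
    name     TEXT,
    easting  REAL,
    northing REAL
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    well_id   INTEGER REFERENCES well(well_id),
    depth_m   REAL
);
CREATE TABLE result (
    result_id INTEGER PRIMARY KEY,
    sample_id INTEGER REFERENCES sample(sample_id),
    analysis  TEXT,
    value     REAL,
    unit      TEXT
);
""")

# Placeholder rows standing in for the import step.
conn.execute("INSERT INTO well VALUES (1, 'example-well', 100.0, 200.0)")
conn.execute("INSERT INTO sample VALUES (10, 1, 5.0)")
conn.executemany(
    "INSERT INTO result (sample_id, analysis, value, unit) VALUES (?, ?, ?, ?)",
    [(10, "analysis_a", 1.5, "unit_a"), (10, "analysis_b", 2.5, "unit_b")],
)

# One query returns every result together with the location of its sample.
rows = conn.execute("""
SELECT w.name, w.easting, w.northing, s.depth_m, r.analysis, r.value, r.unit
FROM result AS r
JOIN sample AS s ON r.sample_id = s.sample_id
JOIN well   AS w ON s.well_id = w.well_id
ORDER BY r.analysis
""").fetchall()
```

With the data spread over separate files, the same question would require opening each file and matching sample identifiers by hand.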
How do I navigate the site?
Site navigation is meant to be intuitive. Once a user is logged in, navigation to a specific page happens through the menu in the left-hand bar. Menu items which can expand are marked with a +. A brief description of what is provided under each link is given below.
- Site and Data overview: this page
- FAQ: Frequently asked questions and answers, including how to change passwords and to find a forgotten password
- Hanford 300 Area IFRC map: Google map showing the different wells at the site. Clicking on a well provides information on what kind of data is available for a well, and for some datasets provides a shortcut to visualizing that data
- Data packages: provides the ability to upload, edit and access datapackages. The access data packages option provides basic search functionality, allowing one to search on data package owner, datatype or words in the datapackage description provided by the owner
- Data by type: this provides access to data which has been fully integrated in the database. This includes
  - Sample data: an interface to all samples and associated results allowing one to search based on different attributes
  - Well log data
  - Temperature data: an interface to data from the IFRC temperature array
  - Hydrological data (data from continuous recording waterlevel, temperature and conductivity sensors in a number of instrumented wells)
- Sample data by dataset: this provides a visual interface allowing visualization of different types of sample datasets.
- Publications: a searchable list of publications, posters and presentations, including brief abstracts and links to either electronic copies or to the publisher's website
- Forum: a link to a phpBB forum available to 300 Area IFRC users
- Alert: this allows users to set an alert on specific values of the USGS Priest Rapids Dam data. Note that this is currently the only "real time" data on the site (harvested from the USGS website)
- USGS waterlevel gage: externally created map (from the USGS website) showing USGS gage data. Included for users who need to see what is going on at Priest Rapids Dam.
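The Alert item in the list above amounts to comparing the most recently harvested value against user-defined thresholds. The Python sketch below illustrates that logic only. The `Alert` class, its fields and the numbers are hypothetical, and the sketch makes no claim about how the site implements alerts or harvests the USGS data.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record: one user, one threshold, one direction."""
    user: str
    threshold: float
    direction: str  # "above" or "below"


def triggered_alerts(alerts, latest_value):
    """Return the alerts whose condition is met by the latest harvested value."""
    hits = []
    for alert in alerts:
        if alert.direction == "above" and latest_value > alert.threshold:
            hits.append(alert)
        elif alert.direction == "below" and latest_value < alert.threshold:
            hits.append(alert)
    return hits


# Placeholder thresholds and value; not real gage readings.
alerts = [Alert("user_a", 10.0, "above"), Alert("user_b", 5.0, "below")]
hits = triggered_alerts(alerts, latest_value=12.0)
```

In this sketch only the alert for `user_a` fires, because 12.0 exceeds its threshold of 10.0 while it is not below the 5.0 threshold set by `user_b`.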
Which datasets exist and where and how do I find them?
Datasets can be organized and grouped in different ways. A high level overview of the available datasets is given below, but users are encouraged to browse the whole site. If you do not find a dataset, please email the database administrator, who may be able to point you in the right direction.
- Sitewide data. This includes digital elevation models, aerial photos of the site and bathymetry. Accessible as datapackages.
- Borehole logs. This includes both geological logs and geophysical logs. Accessible as datapackages, but also through the Hanford 300 Area IFRC map and through Data by primary type -> Well log data. Note that the geological logs are available as pdf only.
- Sample data. Available both as datapackages and through the Data by primary type -> Sample data and Sample data by dataset links.
- Hydrological data: data obtained from continuous logging (typically at intervals of 15 minutes to 1 hour). This includes waterlevel, but in many cases also conductivity and temperature. Accessible as datapackages and as Data by primary type -> Hydrological data. This also includes data from a river gaging station operated by PNNL in the Columbia River and real time data from the USGS station south of Priest Rapids Dam.
- Interpreted geologic data (stratigraphy, interfaces). This data is provided by Bruce Bjornstad, and is available as data packages.
- Electrical geophysical characterization and monitoring data. Both field data and inverted results are available as data packages.
- EBF datasets: EBF (Electromagnetic Borehole Flowmeter) data has been collected multiple times in the IFRC wells. This data is available as a datapackage.
- Data from field experiments. Several large scale field experiments were conducted. The data for these experiments has been combined in experiment specific data packages by Vince Vermeul. Some of this data is also available as hydrological or sample data. These experiments are
  - Non reactive tracer experiment (November 2008)
  - Tracer test (March 2009)
  - Cold water injection (August 2009)
  - Tracer injection (October 2009)
  - Passive experiment (2010-2011)
  - U desorption/tracer experiment (Spring 2011)
- Data from laboratory experiments. Multiple laboratory experiments have been performed. Because no good datamodels currently exist for such experiments, all of this data is present as datapackages.
- Numerical modeling results. Different IFRC project participants have run multiple different simulations. Currently the raw model realizations are not accessible due to size issues (each of these realizations is several hundred GB). In the future this may change. However, several datapackages describing the model results are available.
- Reports and publications. A number of reports, publications and presentations are associated with the 300 Area IFRC. These can be accessed through the Publications link.
Is there data for the 300 Area which is not in the database - and how do I find and get this data?
Research and operational efforts have been going on at the 300 Area for many years. The objective of the 300 Area IFRC data management is to provide access to (in this order)
1. Data collected as part of the 300 Area IFRC
2. Data collected by IFRC scientists under related projects (e.g. the Hanford SFA effort, SBIR projects and/or SBR funded projects)
3. Data which is of direct interest and use to 300 Area IFRC scientists in their analysis of data listed under (1) and (2). This includes e.g. data from sensors in wells and river gage data
Data outside this scope may be found through the following websites:
- Main PNNL Hanford website: this is a good entry point to find out more about PNNL (Pacific Northwest National Laboratory).
- PNNL publication page: this website provides searchable access to all publications and reports from PNNL scientists.
- Environmental Dashboard Application: this website provides searchable access to a large amount of environmental data from PNNL. This includes access to the HEIS database and to well logs.
- Main DOE Hanford website: the main DOE Hanford website. This has a number of links to official document repositories.
- Public Document website (part of the DOE Hanford website): this page links to two public document websites which provide searchable reports. Many of these reports are regulatory oriented.
- Washington State well log browser: this is a map interface allowing one to look for and browse well logs from Washington State. This includes well logs from Hanford.
