Minutes of the Second Workshop on Databases in Stellar Spectroscopy
Paris : 18-19 October 2006




Last minute :

Workshop: "Astronomical Spectroscopy and Virtual Observatory"
               European Space Astronomy Centre of ESA
         Villafranca del Castillo (Spain), 21-23 March 2007


Preamble of the meeting

The aim of this Second Workshop on Databases in Stellar Spectroscopy is to bring together French producers and users of stellar spectra in order to :
- present new achievements and tools that give access to, analyse and use stellar spectra, either observed or synthetic, particularly in the Virtual Observatory
- present scientific projects and needs concerning reference spectra, e.g. for the Gaia project or the synthesis of stellar populations
- discuss how actions could be coordinated.

The minutes contain a summary of the presentations as well as links to the presentations and the projects themselves. The minutes also include the discussions.


Aims of the meeting : C. Soubiran pdf
In this meeting we want to draw up a list of the achievements, new projects and needs in stellar spectroscopy, 3 years after the first meeting that we had on the subject. We also want to learn about available tools and standards in the VO, to give users an opportunity to express their needs to developers, and to discuss possible coordination to avoid duplicated effort.

Libraries of stellar spectra are of great interest in several fields of astronomy : in galactic astronomy they are used in the synthesis of stellar populations, and in stellar physics they are used to validate models of stellar atmospheres. New echelle spectrographs are producing thousands of high-quality spectra that can be made available to improve the scientific return of the instruments. Projects like RAVE and Gaia will produce even more stellar spectra, but such projects also need to calibrate their results using reference libraries.


At the present time, databases of stellar spectra exist at several levels :
- archives of raw data
- science-ready archives (reduced and calibrated spectra)
- specialized databases (with a specific objective like reference spectra, hot stars or spectrophotometry)
- libraries of stellar spectra, representative of a domain of parameters

There is a lot of activity in all these domains, with many new libraries (ELODIE.3, MILES, S4N, Asiago DSD, ...), science-ready archives developed in France (ELODIE-SOPHIE, VLT-GIRAFFE, ESPADONS-NARVAL), databases for large projects (Gaudi and Exo-dat for COROT, database for Gaia), and the development of parametrization tools like MATISSE and TGMET. Other communities are also very active in spectroscopy (e.g. CASSIS, which is a tool for spectroscopy of the ISM).

Users need such tools and databases to be available in order to work more efficiently : homogeneous formats and queries, together with interoperability, make it possible to combine or compare spectra coming from different instruments or to use common database tools. Quality is crucial : users must be sure that the data is of good quality and correctly calibrated. A detailed description of observations, reduction and calibration must be available, as well as a description of the domain of validity and of the internal and external errors.

Here are some examples of what we would like to be able to do :
- combine Stellar Spectra DataBases (SSDB) with other SSDB, catalogues or surveys
- combine data from UV to IR
- combine data of different epochs
- use graphical tools (plots, overplots..)
- use software packages (IRAF, MIDAS, ..) or home-made software
- identify lines (VALD,..)
- browse on-line bibliography and catalogues (ADS, Simbad, VizieR,..)
- measure radial velocities, compute correlations with masks
- measure EW, fit line in an automated way
- perform operations on spectra (average, division,..)
- compare synthetic and observed spectra
- compute on-line a spectrum with given parameters
- interpolate spectra
- determine on-line the parameters of a spectrum
- change resolutions
- go easily from one spectrum to another
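
Several of the operations listed above reduce to short numerical recipes. As a minimal sketch, assuming a continuum-normalised spectrum sampled on a wavelength grid, the equivalent width (EW) of a line can be estimated by trapezoidal integration of (1 - flux) over the line window. The function name and the toy data are illustrative only and are not taken from any of the tools named in these minutes.

```python
def equivalent_width(wavelengths, normalised_flux):
    """Trapezoidal estimate of EW = integral of (1 - F/Fc) d(lambda).

    Assumes the flux is already divided by the continuum, so Fc = 1.
    The result has the same unit as the wavelengths.
    """
    if len(wavelengths) != len(normalised_flux):
        raise ValueError("wavelength and flux lists must have equal length")
    ew = 0.0
    for i in range(len(wavelengths) - 1):
        step = wavelengths[i + 1] - wavelengths[i]
        depth_left = 1.0 - normalised_flux[i]
        depth_right = 1.0 - normalised_flux[i + 1]
        ew += 0.5 * (depth_left + depth_right) * step
    return ew


# Toy triangular absorption line: 0.4 wide at the base, 0.5 deep
wave = [6562.6, 6562.7, 6562.8, 6562.9, 6563.0]
flux = [1.0, 0.75, 0.5, 0.75, 1.0]
ew = equivalent_width(wave, flux)
```

For this triangular toy line the result equals the area of the triangle, 0.5 x 0.4 x 0.5 = 0.1 in wavelength units. Real spectra would additionally need a continuum fit and a choice of integration window.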

Are tools like VOSpec, Specview and SPLAT-VO able to do such things ? What databases of stellar spectra are currently available in the VO ?

The growing national and international structuring of the VO is essential for progress. The IVOA works on interoperability standards and techniques. Euro-VO works at the level of technology and astronomical data centres, and promotes VO science in Europe. The French ASOV coordinates national projects and organizes workshops and tutorials for developers and users.

In France, most laboratories are involved in a project related to VO, providing a range of services. The French community is also largely involved in international collaborations for the development of tools and standards.

Spectroscopy and the VO : Ph Prugniel pdf
We want to use the VO for queries similar to those we handle through ADS or Google. We want to select data on physical criteria and perform operations on this data. Services in spectroscopy are still at the level of prototypes. The problem of standardisation and description has not yet been fully solved. Interactions between developers and astronomers must be reinforced.
A registry is like yellow pages : it gives a list of available services. A client can be the astronomer or an application (Aladin, or any VO service).

VO standards :
- data exchange : FITS and VOTABLE
- data models : how to say that my data is a solar spectrum at R=100000 (tree structure) ?
- access protocols : standard queries (SIA and SSA not yet definitively accepted)
- SQL
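
To make the idea of a standard query concrete, here is a minimal sketch of how a client might build an SSA-style request URL. It rests on several assumptions: the endpoint is hypothetical, the sky position is arbitrary, and the parameter names (REQUEST, POS, SIZE, BAND) follow the IVOA Simple Spectral Access specification as it was later published, which was not yet approved at the time of this meeting and should be checked before use. The wavelength range is the SOPHIE spectral domain quoted in these minutes, expressed in metres.

```python
from urllib.parse import urlencode

# Hypothetical service endpoint, used only for illustration
SSA_ENDPOINT = "http://example.org/ssa"


def build_ssa_query(ra_deg, dec_deg, size_deg, band_min_m, band_max_m):
    """Return a URL asking a service for spectra around a sky position."""
    params = {
        "REQUEST": "queryData",
        "POS": f"{ra_deg},{dec_deg}",          # position in decimal degrees
        "SIZE": f"{size_deg}",                 # search diameter in degrees
        "BAND": f"{band_min_m}/{band_max_m}",  # wavelength range in metres
    }
    # Keep "," and "/" unescaped so the URL stays readable
    return SSA_ENDPOINT + "?" + urlencode(params, safe=",/")


# Arbitrary position, 3872-6943 Angstrom expressed in metres
url = build_ssa_query(344.37, 20.77, 0.05, 3.872e-07, 6.943e-07)
```

A compliant service would answer such a request with a VOTable listing the matching spectra, which the client (an astronomer's script or an application such as Aladin) can then parse.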

Science cases should be given.
There are 3 tools to develop databases : Pleinpot, SAADA and SiTools.
There will be a workshop on tools at the Observatoire de Paris, organized by P. Le Sidaner.

SOPHIE : H. Le Coroller pdf
SOPHIE, the new spectrograph that replaces ELODIE on the 1.93m telescope at OHP, gives 39 orders covering the spectral domain 3872-6943 Å. It has 2 modes : high resolution and high efficiency. It is an excellent instrument for stellar physics, with high throughput and low scattered light. Its on-line reduction software is adapted from HARPS. Observers build their catalogue of observations in advance, with mandatory fields that are written as keywords in the header. The Data Reduction Software is run automatically with a quality control that eliminates problematic exposures (saturation, drifts, ...). Bias subtraction, flat-fielding and cosmic-ray correction are performed automatically. The SOPHIE database gives access to several FITS files : wave, ccf, blaze, e2ds, s1d (reconnected, resampled orders). It is also planned to make the correlation tools available on line. What takes most time is the quality control needed to verify the identifiers (agreement with telescope position, for instance). It is planned to put a parametrization algorithm (TGMET, MATISSE) on line.

ESPADONS - NARVAL : M. Aurière pdf
NARVAL is the new spectropolarimeter that will be available at the Pic du Midi by the end of 2006. It is a copy of ESPADONS, already in use at CFHT. Both instruments give 40 orders covering 370 to 1050 nm in 3 modes of observation : (1) polarimeter with R=65000, (2) spectrometer on object + sky at R=65000, (3) spectrometer on object only at R=80000. At the moment the raw data is stored at the telescope, spectra are automatically reduced in real time, and new reductions can be made at will. A typical reduced spectrum is a file of 15 MB. A total of 100000 spectra are expected over the next 10 years. Other files are associated with the reduced spectra : the result of the reduction, exposure information, Simbad information, automated classification, a link to Pollux. It is planned to develop tools enabling the archive to be queried using various criteria, and to adopt VO standards.
Current actions are to make the Esprit-Libre reduction software operational, to adapt and test the Matisse parametrization algorithm, to make synthetic spectra from Pollux available and to observe reference stars. All ESPADONS spectra will be reduced homogeneously. It would be very interesting to have a link to a database of stellar parameters. Four astronomers are building the database but technical maintenance has not yet been provided. There is a lack of specific VO competence.

GIRAFFE at VLT : F Royer pdf
The instrument has 3 modes : ARGUS, IFU, MEDUSA. It has been in operation since April 2003. 2000 images per year are made public after 1 year. In the ESO archive, raw data and calibration exposures are available, whereas the GIRAFFE archive gives science-ready data with added value from identification and selection criteria. The VLT pipeline will soon be available to create reduced spectra and perform quality control.

ELODIE workflow : A. Sarkissian pdf
A science case has been developed as an example of what can be done with the VO : finding exoplanet signatures in existing datasets. An automatic search has been done on 200 ELODIE spectra from the Queloz & Mayor programme. Simbad has been queried via the ELODIE archive, as well as the exoplanet encyclopedia, and the BASECOL database for the spectral analysis. For 51 Peg, there are 40 public spectra, which have been downloaded. Spectral lines have been identified and fitted with a Gaussian, and the Doppler shift has then been plotted as a function of time.
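
The line-fitting step of this workflow can be sketched as follows. This is not the workflow's actual code: instead of a full least-squares Gaussian fit it uses three-point Gaussian interpolation around the deepest sample (exact for a noise-free Gaussian on an evenly spaced grid), followed by the non-relativistic Doppler formula v = c (lambda_obs - lambda_rest) / lambda_rest. The function names, the rest wavelength and the synthetic line are all illustrative assumptions.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def gaussian_line_centre(wave, depth):
    """Centre of a Gaussian absorption line from its three deepest samples.

    `depth` is 1 - normalised flux. The logarithm of a Gaussian is a
    parabola, so three equally spaced samples determine its centre.
    """
    k = max(range(1, len(depth) - 1), key=lambda i: depth[i])
    ln_left, ln_mid, ln_right = (math.log(depth[j]) for j in (k - 1, k, k + 1))
    step = wave[k + 1] - wave[k]
    curvature = ln_left - 2.0 * ln_mid + ln_right
    return wave[k] + 0.5 * step * (ln_left - ln_right) / curvature


def radial_velocity(observed_centre, rest_wavelength):
    """Non-relativistic Doppler velocity in km/s."""
    return C_KM_S * (observed_centre - rest_wavelength) / rest_wavelength


# Synthetic line shifted by +10 km/s from an illustrative rest wavelength
rest = 5000.0
true_centre = rest * (1.0 + 10.0 / C_KM_S)
wave = [4999.5 + 0.02 * i for i in range(51)]
depth = [0.6 * math.exp(-0.5 * ((w - true_centre) / 0.05) ** 2) for w in wave]
velocity = radial_velocity(gaussian_line_centre(wave, depth), rest)
```

Repeating this measurement on each of the dated spectra of a star gives the velocity time series in which a periodic signal can be searched for. With noisy data, a least-squares fit over the whole line profile would be preferable to the three-point estimate.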
Another interesting workflow studies the variations of the water vapour in the atmosphere above OHP.

EXO-DAT an information system for COROT : M. Deleuil pdf
The satellite COROT has two objectives : the asteroseismic study of several targets with V<9 mag (main targets at V=6 mag), and the search for exoplanet signatures among 12000 targets with 11<V<16 mag. In order to help the analysis of the light curves, it is necessary to have information on individual targets and on the stellar content of the sample. A ground-based observing programme has started, including UBVRi on INT at La Palma (107 stars), a variability survey with the BEST telescope at OHP, and spectroscopic observations at VLT-GIRAFFE on 700 stars to make the spectral classification. The database Exo-DAT was built to archive these various observations (photometric, spectroscopic, time sequences), together with catalogues and classification tools. SiTools from the CNES was used to build the interface. The specifications were described with the help of computer scientists, as part of the OAMP ENVOL project. Engineers feed the database with observed data, but this technical part should be reinforced. At the present time there is a lack of tools, visualisation tools for instance.

Archive of the Buryakan Survey : C. Rossi pdf
There is a collaboration between Italy and Armenia to make an archive from the prism plates of the Buryakan Survey. 1780 plates are being digitized. An astrometric solution from GS2 is used to extract the spectra (see http://buryakan.phys.univroma1.it/ ). The spectra concern objects in the magnitude range 12-17, with low overlap with SDSS. The main problem now concerns automatic flux calibration in order to perform a systematic classification of the targets.

Automated parametrization of hot stars : C. Martayan pdf
CRFIT is a minimum distance algorithm to find the best fit between an observed and a synthetic spectrum. ATLAS9 is used for the line lists and abundances. The algorithm has been tested on large samples of stars of the Magellanic Clouds. The algorithm is fully automatic for B stars, but for Be stars it has to be interactive. There are 2 methods of classification : line broadening or fundamental parameters. The algorithm has been applied to massive stars extracted from the ELODIE archive, mostly using the blue side of the spectra. In some cases there is a clear problem of continuum in the reconnected spectra. The results show satisfactory agreement with the TGMET and TGM algorithms.
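
The principle of a minimum distance method can be sketched in a few lines. This is a generic illustration, not CRFIT itself: the minutes do not describe its distance metric, weighting or interpolation, so the sketch simply assumes an unweighted sum of squared flux differences over a small grid of synthetic spectra sampled on the same wavelength grid as the observation. All names and values are hypothetical.

```python
def squared_distance(observed, model):
    """Sum of squared flux differences between two spectra on the same grid."""
    return sum((o - m) ** 2 for o, m in zip(observed, model))


def best_match(observed, grid):
    """Return (parameters, distance) of the closest synthetic spectrum.

    `grid` maps a parameter tuple, here (Teff, logg), to a list of fluxes.
    """
    best_params = min(grid, key=lambda p: squared_distance(observed, grid[p]))
    return best_params, squared_distance(observed, grid[best_params])


# Toy grid of three synthetic spectra (values are illustrative only)
grid = {
    (15000, 4.0): [1.00, 0.80, 0.60, 0.80, 1.00],
    (20000, 4.0): [1.00, 0.85, 0.70, 0.85, 1.00],
    (25000, 3.5): [1.00, 0.90, 0.80, 0.90, 1.00],
}
observed = [1.00, 0.86, 0.71, 0.84, 1.00]
params, distance = best_match(observed, grid)
```

Here the observed spectrum is closest to the (20000, 4.0) grid point. The cost of such a search grows with the size of the grid, since every observed spectrum is compared with every model.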

BESS, Be Stars Spectra : C. Neiner pdf
The BeSS database was built to collect Be spectra from various instruments. The spectra can be downloaded and uploaded. Before being entered into the database, uploaded spectra are verified for their format. Information about the star, the spectrum and the exposure is retrieved using keywords and CDS. The database is on an Apache server, and uses PostgreSQL and Pleinpot.
It will be available to the public by December 2006. It will be compatible with the VO when spectroscopic standards have been definitively adopted and stabilized. In 2007, version V2.0 will also include the fundamental parameters of the stars. Three people work on the database. Data is verified by a single person as it comes in. Any registered observer can feed the database.

MATISSE : A. Recio-Blanco pdf
MATISSE (MATrix Inversion for Spectral SynthEsis) is an algorithm for the automated derivation of stellar atmospheric parameters and chemical abundances, planned to be used as the GSP-Spec (Generalized Stellar Parametrizer - Spectroscopic) in Gaia. It was specially developed to be applied to RVS spectra, which include the CaII triplet and Fe and alpha lines. It performs better than other methods such as minimum distance and neural networks. The minimum distance method is too slow for Gaia. Neural networks are like a black box, with large systematic errors in certain cases. The non-linearity of the parameter space has to be considered. The MATISSE algorithm determines a basis, B_\theta(\lambda), enabling a particular stellar parameter \theta to be derived by projection of an observed spectrum. The B_\theta(\lambda) function is determined from an optimal linear combination of theoretical spectra and it relates, in a quantitative way, variations in the spectrum flux to variations in \theta. A grid of 10000 spectra has been built. Performance on the determination of [alpha/Fe] for the RVS domain and resolution is sufficient to provide constraints on galactic models of chemical evolution. The B functions can be visualised to understand the underlying physics. Once training has been done, the algorithm is very fast. It can be applied to other spectra. There is a project of 3000 VLT FLAMES spectra to search for vertical gradients in the thick disk.
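
The projection idea can be illustrated with a deliberately tiny example. This is not the MATISSE implementation: it assumes only two synthetic spectra and builds the combination B = alpha_a * S_a + alpha_b * S_b that returns the correct parameter when projected on each of them, whereas MATISSE derives an optimal combination from a large grid. All names and flux values are hypothetical.

```python
def dot(a, b):
    """Scalar product of two spectra sampled on the same wavelength grid."""
    return sum(x * y for x, y in zip(a, b))


def build_basis(spec_a, theta_a, spec_b, theta_b):
    """Toy B_theta: combination of two synthetic spectra that returns
    theta_a when projected on spec_a and theta_b when projected on spec_b."""
    caa, cab, cbb = dot(spec_a, spec_a), dot(spec_a, spec_b), dot(spec_b, spec_b)
    det = caa * cbb - cab * cab
    if det == 0.0:
        raise ValueError("the two spectra are proportional; no unique basis")
    # Solve the 2x2 linear system for the combination coefficients
    alpha_a = (theta_a * cbb - theta_b * cab) / det
    alpha_b = (theta_b * caa - theta_a * cab) / det
    return [alpha_a * x + alpha_b * y for x, y in zip(spec_a, spec_b)]


def project(basis, observed):
    """Parameter estimate: projection of the observed spectrum on the basis."""
    return dot(basis, observed)


cool = [1.0, 0.6, 1.0, 0.9]  # illustrative spectrum labelled Teff = 5000 K
hot = [1.0, 0.9, 1.0, 0.6]   # illustrative spectrum labelled Teff = 6000 K
basis = build_basis(cool, 5000.0, hot, 6000.0)
observed = [0.5 * (c + h) for c, h in zip(cool, hot)]
teff = project(basis, observed)
```

An observed spectrum halfway between the two models projects to 5500 K. The example also shows why the method is fast once training is done: the basis is computed once, and each new spectrum costs a single scalar product per parameter.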

POLLUX : A. Lèbre, E. Josselin
POLLUX, the database of reference spectra developed at GRAAL in Montpellier, has 3 axes of development : observed spectra, computed spectra and applications. High priority is given to the theoretical part. A grid of synthetic spectra is being computed in the range 300-1200 nm at R=150000. There are 2 versions of the spectra : continuum normalised and absolute flux calibrated. The HR diagram is covered with variation in Teff, logg, [Fe/H], [alpha/Fe] and CNO abundances. Models are MARCS, ATLAS, TLUSTY, CMFGEN. Atomic data comes from various sources (VALD, ...). Current work concerns the description of the data (the input model and the references of the lines must be described), on-the-fly convolution, correlation tools with masks, and descriptors for Gaia. A single spectrum is stored in ASCII format (15 MB). There is a problem with the CPU time needed to compute the spectra. The database is developed using Python and Postgres. Validation of the synthetic spectra must be done. The maintenance of the database is ensured by a researcher and an engineer. The web interface has been created but has not yet been opened.

Access to atomic and molecular data in the VO : M.-L. Dubernet
Atomic and Molecular Databases at Observatoire de Paris
Abstract for ICAMDATA05, Meudon-France, October 2006

Gaia : C. Soubiran pdf
Gaia : 10^9 stars will be observed; in astrometry the accuracy is 20 micro-arcsec at V=15 mag and better than 300 micro-arcsec at V=20 mag; in photometry there is a broad band G, and dispersed bands BP-RP equivalent to 30 narrow bands; in spectroscopy the medium-resolution instrument RVS will give access to radial velocities, rotational velocities, atmospheric parameters and abundances. Launch is planned for December 2011, for 5 years of observation with uniform coverage of the sky (on average, 100 observations per object, 50 for RVS). Publication of data is planned for 2020. Scientific objectives : Galaxy cartography, kinematics and dynamics, stellar physics, extragalactic distance scale, dark matter, age of the universe, reference system, exoplanet detection, fundamental physics, solar system studies, solar physics. The Data Processing is organised as a consortium with national funds. DPAC has 276 members (68 French) who participate in one or several coordination units which are in charge of a number of tasks.
The RVS instrument covers 847-874 nm at R=11500. Its data processing is done in CU6. There is no calibration on board, so the zero point of the radial velocities must be calibrated with reference sources (stars and asteroids with V<10 mag) that are well known in advance and stable at the level of 300 m/s. Observations are planned in order to set up a grid of 2000 reference stars that must be selected with precise criteria and observed at least twice to verify their stability. Similarly, reference stars must also be selected for the astrophysical parametrization (CU8), covering the whole HR diagram with a range of metallicities. These stars, well known in advance, will be observed by Gaia to calibrate the 2 Generalised Stellar Parametrizers (GSP-phot and GSP-spec) and the Extended Stellar Parametrizer (ESP). Two reference grids must be built : one bright for the spectroscopy, and one faint for the photometry. An observing programme to select the reference stars and measure their parameters homogeneously has been planned. Since there will be a large volume of auxiliary data (catalogues of radial velocity, atmospheric parameters, basic data, observed spectra, measured parameters, ...), a dedicated database has been built in order to facilitate the compilation of the auxiliary data, to easily feed the database and to share it with several other groups.
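
A short calculation shows how demanding the 300 m/s stability requirement is compared with the RVS resolution. The sketch below uses only the figures quoted above (847-874 nm, R=11500, 300 m/s); taking the middle of the band as the reference wavelength is an assumption made for illustration, and the function names are hypothetical.

```python
C_M_S = 299_792_458.0  # speed of light in m/s


def resolution_element_nm(wavelength_nm, resolving_power):
    """Width of one resolution element, delta_lambda = lambda / R."""
    return wavelength_nm / resolving_power


def doppler_shift_nm(wavelength_nm, velocity_m_s):
    """Wavelength shift produced by a (non-relativistic) radial velocity."""
    return wavelength_nm * velocity_m_s / C_M_S


central_nm = 0.5 * (847.0 + 874.0)                   # middle of the RVS band
element = resolution_element_nm(central_nm, 11500)   # about 0.075 nm
shift = doppler_shift_nm(central_nm, 300.0)          # about 0.00086 nm
fraction = shift / element                           # about 0.0115
```

The ratio does not depend on the wavelength (it equals v R / c), so a 300 m/s shift corresponds to roughly 1% of a resolution element, which is why the reference sources must be well characterised in advance.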

Spectroscopic databases at the AIP : A. Siebert pdf
The AIP hosts two spectroscopic databases : RAVE and the Gaia CU6/8 calibration data archive. The current status and future development of the two databases were presented. With the (soon) available VO standard for spectra, the RAVE database will evolve to a full VO service, enabling easy access and use of the RAVE spectra. Public release of the RAVE spectra is planned for 2008.

The ELODIE library : Ph. Prugniel
A new version of the ELODIE library is under construction, with new stars and improved flux calibration.

Analysis of the stellar populations : Mina Koleva pdf
The parameters of stellar populations (SP) can be derived by comparing observations with models of integrated light spectra. These models are usually produced by evolutionary synthesis codes using different IMFs, isochrones and stellar libraries.
The following points are shown :
1) comparison of the different synthetic models (including the comparison of stellar libraries, isochrones ...)
2) comparison of results obtained from spectral fitting of real objects (GCs) with those obtained using other techniques (CMD)


Conclusion and synthesis of the discussions

1 - A positive evolution of the projects
Since the first meeting on the subject, three years ago, the situation has evolved in a positive way. Some of the projects are now operational and give access to stellar spectra via public interfaces. Several new projects have appeared. These different projects have adopted various technical solutions. The manpower, which seemed inadequate 3 years ago, now meets the objectives. It is worth noting that several recent recruitments of researchers are associated with these projects.

2 - Virtual Observatory, data model and tools
The Virtual Observatory has also evolved considerably, but its progress is still limited by the lack of approved standards. SSAP (Simple Spectral Access Protocol) has not yet been approved.

Unfortunately, it was not possible to have a demonstration of several tools which are already available to handle VO-SSAP compliant spectra. Here are the most common tools :
- VOSpec
- SPLAT-VO
- Specview

Giraffe Archive, FUSE and Hyperleda use technical solutions offered by the VO; they also take advantage of VO tools, even if these are still under development. Other projects are waiting for the approval of SSAP and for more evolved tools.

Several groups have their own actions outside the VO to describe their data. For instance, synthetic spectra need a special effort of description, which is particularly urgent because of their use in the preparation of Gaia.

Interactions between astronomers and developers have to be reinforced to stimulate the development of standards and tools, and to enlarge the content of the VO.

3 - Use cases
A list of actions that astronomers usually perform with their stellar spectra was presented (see Aims of the meeting by C. Soubiran). It is planned to complete this list, to extract the most characteristic needs, and to look in detail at how they are currently satisfied and how the VO could better satisfy them.

4 - Organisation
The technical work and feeding of the databases are not very demanding in manpower. What is clearly expensive in time is quality control, which requires scientific expertise.

The projects associated with space missions must satisfy certain time constraints and usually benefit from greater technical support. The other projects have to find a balance between their objectives and available means. None of the presented projects seems to suffer from a critical situation in terms of either manpower or means.

Several projects share experiences and tools, e.g. Hyperleda, ELODIE-SOPHIE and Giraffe, or RAVE and Gaia.

There is a clear demand from participants to share their expertise and means, and this meeting was an opportunity to establish contacts. There was also a request for information about VO standards and tools to be relayed, because not all participants are willing to follow IVOA activities and other VO meetings closely. This is clearly what the spectroscopic group of ASOV should do in the next few months.

5 - Conclusion
The French community is very active in several projects which aim at giving access to stellar spectra, in the form of archives, libraries or specialised databases. Homogenization of the access and tools, and interoperability of the databases, are necessary to obtain a better scientific return from these resources. Interactions between astronomers and developers must be improved in order to follow through on the proposals made in the VO and to formalize, as well as possible, needs and solutions.