Overview

The data produced by modern astronomical observations fall in the domain of BigData. New wide-field imaging survey instruments such as OmegaCAM (optical), VISTA (near infrared) and LOFAR (radio) already exceed by an order of magnitude the data volumes of the current ESO VLT telescopes and the famous Sloan Digital Sky Survey. In the future, these data volumes will in turn be exceeded by the ALMA sub-mm observatory, the Apertif radio imager, the Gaia satellite, which will observe 1 billion stars from 2013 onwards, and the European Euclid satellite, which will image tens of billions of galaxies.

The course aims to build a practical understanding of BigData: how to handle, query and mine big datasets, but also how BigData systems work and how they are designed.

The course starts with an introduction to astronomical information systems and how to use them to obtain astrometrically and photometrically calibrated astronomical images. This is followed by lectures on the scientific use of astronomical databases and information systems, including the European and International Virtual Observatory and the Astro-WISE information system. The course will cover information system theory and its practical applications. We will discuss the design of scientific information systems, their principal concepts, how they work and how they connect users to large databases, parallel computers, data archives, networks and Grid infrastructure. We will explain how users can do their research with the Virtual Observatory, and the practical assignments will include exercises on the use of astronomical information systems.

During the course, students will become familiar with a number of concepts and languages, including XML, UML, SADT, SQL and R.

Throughout the course a number of practical exercises will be given, and a key aim is to enable the student to reach a level where she/he can apply the tools to her/his own research problems. At the end of the course the student should be able to plan the optimal data processing for her/his research, to select the necessary tools, to use the Virtual Observatory for data mining, and to publish her/his own results.

After having taken this course you should be able to efficiently search and access astronomical databases, and you should know the basics of how to access databases both through on-line interfaces and programmatically. You will also be able to apply up-to-date statistical techniques to your data and use these to mine large datasets for scientifically interesting information.
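
As a rough illustration of what programmatic database access with SQL can look like, the sketch below queries a small in-memory SQLite table from Python. It is not taken from the course material: the table name `sources`, its columns, the sample rows and the helper `bright_sources_in_box` are all hypothetical, and SQLite merely stands in for a real astronomical archive, which would hold far larger tables and be reached over the network.

```python
import sqlite3

# Hypothetical toy catalogue standing in for a real astronomical database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sources (id INTEGER PRIMARY KEY, ra REAL, dec REAL, mag_r REAL)"
)
rows = [
    (1, 150.10, 2.20, 18.4),
    (2, 150.12, 2.21, 21.7),
    (3, 150.95, 2.80, 19.9),
    (4, 151.40, 1.95, 17.2),
]
conn.executemany("INSERT INTO sources VALUES (?, ?, ?, ?)", rows)


def bright_sources_in_box(conn, ra_min, ra_max, dec_min, dec_max, mag_limit):
    """Return sources inside a rectangular sky region brighter than mag_limit.

    Smaller magnitudes are brighter, hence the `<` comparison.
    """
    query = """
        SELECT id, ra, dec, mag_r
        FROM sources
        WHERE ra BETWEEN ? AND ?
          AND dec BETWEEN ? AND ?
          AND mag_r < ?
        ORDER BY mag_r
    """
    params = (ra_min, ra_max, dec_min, dec_max, mag_limit)
    return conn.execute(query, params).fetchall()


result = bright_sources_in_box(conn, 150.0, 151.0, 2.0, 3.0, 20.0)
for source_id, ra, dec, mag in result:
    print(source_id, ra, dec, mag)
```

The query uses placeholders (`?`) rather than string formatting, so the selection limits are passed as parameters. The same pattern of a parameterised selection on position and brightness would carry over to larger catalogues, although real archives typically offer their own query interfaces.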