Summarizes the categories of LSST data products, based on the definitions in the SRD (LPM-17) and DPDD (LSE-163).
LSST must supply trusted petascale data products. The mechanisms by which the LSST project achieves this unprecedented level of data quality will have spinoff to data-enabled science generally. This document specifies high-level requirements for an LSST Data Quality Assessment Framework, and defines the four levels of quality assessment (QA) tools. Because this process involves system-wide hardware and software, data QA must be defined at the System level. The scope of this document is limited to the description of the overall framework and the general requirements. It derives from the LSST Science Requirements Document [LPM-17]. A flow-down document will describe detailed implementation of the QA, including the algorithms. In most cases the monitoring strategy, the development path for these tools or the algorithms are known. Related documents are: LSST System Requirements [LSE-29], Optimal Deployment Parameters Document-11624, Observatory System Specifications [LSE-30], Configuration Management Plan [LPM-19], Project Quality Assurance Plan [LPM-55], Software Development Plan [LSE-16], Camera Quality Implementation Plan [LCA-227], System Engineering Management Plan [LSE-17], and the Operations Plan [LPM-73].
This document describes the data products and processing services to be delivered by the Large Synoptic Survey Telescope (LSST).
LSST will deliver three levels of data products and services. PROMPT (Level 1) data products will include images, difference images, catalogs of sources and objects detected in difference images, and catalogs of Solar System objects. Their primary purpose is to enable rapid follow-up of time-domain events. DATA RELEASE (Level 2) data products will include well-calibrated single-epoch images, deep coadds, and catalogs of objects, sources, and forced sources, enabling static sky and precision time-domain science. The SCIENCE PLATFORM will allow for the creation of USER GENERATED (Level 3) data products and will enable science cases that greatly benefit from co-location of user processing and/or data within the LSST Archive Center. LSST will also devote 10% of observing time to programs with special cadence. Their data products will be created using the same software and hardware as Prompt (Level 1) and Data Release (Level 2) products. All data products will be made available using user-friendly databases and web services. Note: prior to 2018, data products were referred to as Level 1, Level 2, and Level 3; this nomenclature was updated in 2018 to Prompt Products, Data Release Products, and User Generated Products, respectively [LPM-231]. In the abstract of this document both nomenclatures are used, but throughout the remainder of this document only the new terminology is used. In other project and requirements documentation, the old terminology will likely persist.
For discussion about better specifying Operations Rehearsals.
This document describes the detailed acceptance test specification for the LSST Data Management System.
This is the charge for Data Management QA Strategy Working Group, to be convened in April 2018.
Use Cases written by the Butler Working Group covering data discovery, data storage, and data retrieval.
This document describes the Chilean DAC.
This document describes release management at a high level and specific features for upcoming releases.
This document describes the design of the LSST Science Platform, the primary user-facing interface of the LSST Data Management System.
This document describes the detailed test specification for the LSST Science Platform.
This document describes the detailed test specification for the LSST DM Raw Image Archiving Service. This is a specific DM test, and will grow as more tests are needed for the entire environment. This includes two individual tests for the overall raw image creation and ingest into the permanent record of the survey.
This document describes the detailed test specification for the LSST Level 2 System.
This document describes the detailed test specification for the LSST Level 1 System.
This is the Test Plan for Data Management. In it we define terms associated with testing and further test specifications for specific items.
This document specifies the taxonomy of documentation that Data Management produces and policies for their production, management and delivery.
This management plan covers the organization and management of the Data Management (DM) subsystem during the development, construction, and commissioning of LSST. It sets out DM goals and lays out the management organization roles and responsibilities to achieve them. It provides a high level overview of DM architecture, products and processes. It provides a structured starting point for understanding DM and pointers to further documentation.
This document describes the operational concepts for the emerging LSST Data Facility, which will operate the system that will be delivered by the LSST construction project. The services will be incrementally deployed and operated by the construction project as part of verification and validation activities within the construction project.
The LSST middleware is designed to isolate scientific application pipelines and payloads, including the Alert Production, Data Release Production, Calibration Products Productions, and science user pipelines executed within the LSST Science Platform, from details of the underlying hardware and system software. It enables flexible reuse of the same code in multiple environments ranging from offline laptops to shared-memory multiprocessors to grid-accessed clusters, with a common I/O and logging model. It ensures that key scientific and deployment parameters controlling execution can be easily modified without changing code but also with full provenance to understand what environment and parameters were used to produce any dataset. It provides flexible, high-performance, low-overhead persistence and retrieval of datasets with data repositories and formats selected by external parameters rather than hard-coding. Middleware services enable efficient, managed replication of data over both wide area networks and local area networks.
The LSST Science Requirements Document (the LSST SRD) specifies a set of data product guidelines, designed to support science goals envisioned to be enabled by the LSST observing program. Following these guidelines, the details of these data products have been described in the LSST Data Products Definition Document (DPDD), and captured in a formal flow-down from the SRD via the LSST System Requirements (LSR), Observatory System Specifications (OSS), to the Data Management System Requirements (DMSR). The LSST Data Management subsystem’s responsibilities include the design, implementation, deployment and execution of software pipelines necessary to generate these data products. This document describes the design of the scientific aspects of those pipelines.
The LSST Data Management System (DMS) is a set of services employing a variety of software components running on computational and networking infrastructure that combine to deliver science data products to the observatory’s users and support observatory operations. This document describes the components, their service instances, and their deployment environments as well as the interfaces among them, the rest of the LSST system, and the outside world.
This document discusses the LSST database system architecture.
This is the test report for LDM-503-5 (Alert Distribution Validation), an LSST DM level 1 milestone pertaining to the LSST Alert Distribution System.
This is the test report for LDM-503-3 (Alert Generation), an LSST DM level 2 milestone pertaining to the LSST Alert Production System.
This is the test report for LDM-503-1 (WISE Data Loaded in PDAC), an LSST DM level 2 milestone pertaining to the LSST Science Platform, with tests performed according to LSP-00, Portal and API Aspect Deployment of a Wide-Area Dataset.
This is the test report for LDM-503-2 (HSC Reprocessing), an LSST DM level 2 milestone pertaining to the LSST Level 2 System.
Preliminary design of the LSST Science Platform Authentication system
We describe the design of the LSST Alert Distribution System, which provides rapid dissemination of alerts as well as simple filtering.
How the AP pipeline is invoked and how it communicates with the Butler and Prompt Products Database.
Planning out datatests for regular performance monitoring of the Science Pipelines from CI through large-scale performance reports.
Outline and describe the implementation of the DAX web services, and how they interact with each other and the rest of the DM system and SUIT portal.
Notes on running the Stack using Singularity
This document summarizes the status and procedures of the HSC data reprocessing campaigns done by LDF as of early Fall 2018 cycle.
A set of candidate use cases for a next-to-the-database processing system.
Notes on aspects of the Kubernetes Cluster
A proposal for provenance handling from 2016
The EFD is a powerful tool for correlating pipeline behavior with the state of the observatory. Since it contains the logging information published to the service abstraction layer (SAL) for all sensors and devices on the summit, it is a one-stop shop for looking up the state of the observatory at a given moment. The expectation is that the science pipelines validation, science verification, and commissioning teams will all need, at one time or another, to get information like this for measuring the sensitivity of various parts of the system to observatory state: e.g. temperature, dome orientation, wind speed and direction, gravity vector. This leads to the further expectation that the various DM and commissioning teams will want to work with a version of the EFD that is accessed like a traditional relational database, possibly with linkages between individual exposures and specific pieces of observatory state. This implies the need for another version of the EFD (called the DM-EFD here) that has had transforms applied to the raw EFD that make it more immediately applicable to questions the DM team will want to ask.
An outline of how to run a pipeline for alert packaging, streaming, filtering, and consuming.
LSST images will be contaminated with transient artifacts, such as optical ghosts, satellite trails, and cosmic rays, and with transient astronomical sources, such as asteroid ephemerides. We developed and tested an algorithm to find and reject these artifacts during coaddition, in order to produce clean coadds to be used for deep detection and preliminary object characterization. This algorithm, CompareWarpAssembleCoadd, uses the time-series of PSF-matched warped images to identify transient artifacts. It detects artifact candidates on the image differences between each PSF-matched warp and a static sky model. These artifact candidates include both true transient artifacts and difference-image false positives such as difficult-to-subtract sources and variable sources such as stars and quasars. We use the feature that true transients appear at a given position in the difference images in only a small (configurable) fraction of visits, whereas variable sources and difficult-to-subtract sources appear in most difference images. In this report, we present a description of the method and an evaluation using Hyper Suprime-Cam PDR1 data.
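The visit-fraction criterion described in the abstract above can be illustrated with a minimal sketch. This is not the CompareWarpAssembleCoadd implementation: the function name, the nested-list mask representation, and the 0.3 threshold are all assumptions made for illustration only.

```python
def flag_transient_artifacts(detection_masks, max_visit_fraction=0.3):
    """Flag difference-image detections that recur in only a few visits.

    detection_masks: list (one entry per visit) of equally shaped 2D lists
    of booleans, True where that visit's difference image has a detection.
    max_visit_fraction: hypothetical threshold; a detection is treated as a
    transient artifact when the pixel is detected in at most this fraction
    of visits.
    Returns one mask per visit, True where the pixel should be rejected.
    """
    n_visits = len(detection_masks)
    n_rows = len(detection_masks[0])
    n_cols = len(detection_masks[0][0])

    # Count, per pixel, the number of visits with a detection there.
    counts = [
        [sum(1 for mask in detection_masks if mask[r][c]) for c in range(n_cols)]
        for r in range(n_rows)
    ]

    max_count = max_visit_fraction * n_visits
    # Reject only detections at positions that are rarely detected;
    # positions detected in most visits (variable or hard-to-subtract
    # sources) are left alone.
    return [
        [
            [mask[r][c] and counts[r][c] <= max_count for c in range(n_cols)]
            for r in range(n_rows)
        ]
        for mask in detection_masks
    ]


# Five visits of a 1x3 image: pixel 0 is hit once (e.g. a cosmic ray),
# pixel 1 is detected in every visit (e.g. a variable star).
visits = [
    [[True, True, False]],
    [[False, True, False]],
    [[False, True, False]],
    [[False, True, False]],
    [[False, True, False]],
]
rejected = flag_transient_artifacts(visits)
```

In this toy input only the single-visit detection at pixel 0 is rejected; the detection that recurs in every visit is kept, which mirrors the distinction the abstract draws between true transients and variable or difficult-to-subtract sources.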
Notes and recommendations of an investigation into third-party tools for consolidating system deployment and management across LSST enclaves and physical sites
We quantify the performance of the LSST pipeline for processing crowded fields, using images obtained from DECam and comparing to a specialized crowded field analysis performed as part of the DECAPS survey.
Considering single-visit LSST depth, in an example field of roughly the highest density seen in the LSST Wide-Fast-Deep area, DECAPS detects 200 000 sources per sq. deg. to a limiting depth of 23rd magnitude. At this source density the mean LSST-DECAPS completeness between 18th and 20th mag is 80%, and it drops to 50% at 21.5 mag.
For fields inside the Galactic plane cadence zone, source density rapidly increases. For instance, in a field in which DECAPS detects 500 000 sources per sq. deg. (5σ depth of 23.2), the mean completeness between 18th and 20th mag is 78%, and it drops to 50% at 20.2 mag.
In terms of photometric repeatability, above 19th mag LSST and DECAPS are in a systematics-dominated regime, and there is only a slow dependence on source density. At fainter magnitudes, the scatter between LSST and DECAPS is less than the uncertainty from photon noise for source densities up to 100 000 per sq. deg, but the scatter grows to twice the photon noise at densities of 300 000 per sq. deg. and above.
For repeat measurements of the same field with LSST, the astrometric scatter per source is at the level of 10-30 milliarcseconds for bright stars (g < 19), and is not strongly dependent on stellar crowdedness.
Design note on the possibilities for the Internet-facing endpoints of the LSST Science Platform deployments.
Design sketch and mathematics for a new approach to building coadds and constraining their inputs.
This document will:
Describe the current status of “QA” tools, in the broadest sense, currently provided by Data Management;
Sketch out a set of common use cases and requirements for future QA tool and service development across the subsystem.
It is intended to serve as input to the planning currently being undertaken by the DM Leadership Team, the DM System Science Team, and the DM QA Strategy Working Group (LDM-622).
Documentation for the SQL schema that will be used to manage datasets in the Gen3 Butler.
Explanation of how to prepare a cluster for Kubernetes
The goals of this Summer 2014 (S14) task were to understand the scope of the differential chromatic refraction (DCR) issue using a realistic range of stellar spectral energy distributions (SEDs).
The goals of this Winter 2014 (W14) task are to investigate the effects of differential chromatic refraction (DCR) on the rate of false positives in image differences.
We report on the investigation into the use of lossy compression algorithms on LSST images that otherwise could not be stored for general retrieval and use by scientists. We find that modest quantization of images coupled with lossless compression algorithms can provide a factor of ∼6 savings in storage space while still providing images useful for follow-up scientific investigations. Given that this is the only means by which some products could be made quickly available to users, and that it would free for community use the resources that would otherwise be needed to re-compute these products, we recommend that LSST consider using lossy compression to archive and serve image products where appropriate.
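The quantize-then-compress idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the scheme evaluated in the report: the synthetic Gaussian "image", the quantization step of one quarter of the noise sigma, and the use of zlib as the lossless stage are all hypothetical choices.

```python
import random
import struct
import zlib

# Synthetic stand-in for image pixels: sky level plus Gaussian noise.
rng = random.Random(0)
sky_level, noise_sigma = 100.0, 10.0
pixels = [rng.gauss(sky_level, noise_sigma) for _ in range(20000)]

# Lossy step: quantize to a fraction of the noise (assumed factor of 4).
step = noise_sigma / 4.0
quantized = [round(value / step) for value in pixels]

# Lossless step: compress the raw floats and the quantized integers.
raw_bytes = struct.pack(f"{len(pixels)}f", *pixels)
quantized_bytes = struct.pack(f"{len(quantized)}i", *quantized)
raw_compressed_size = len(zlib.compress(raw_bytes))
quantized_compressed_size = len(zlib.compress(quantized_bytes))

# Reconstruct and measure the worst-case error introduced by quantization.
restored = [q * step for q in quantized]
max_error = max(abs(a - b) for a, b in zip(pixels, restored))

print("compressed raw bytes:      ", raw_compressed_size)
print("compressed quantized bytes:", quantized_compressed_size)
print("max quantization error:    ", max_error)
```

The quantization error is bounded by half the step, so choosing the step as a fraction of the noise keeps the loss below the noise level while making the data far more compressible. The ratio achieved here depends on the synthetic data and on zlib, and should not be read as reproducing the factor quoted in the report.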
How catalog content and metadata is and will be handled
Pipeline driver tasks such as singleFrameDriver, coaddDriver, and multiBandDriver have been tested to see what their memory usage is. This technote will detail how these memory tests were run and what results were found.
This document provides an in-depth description of the role of the LSST Project in preparing software and providing computational resources to process the data from Special Programs (deep drilling fields and/or mini-surveys). The plans and description in this document flow down from the requirements in LSE-61 regarding processing for Special Programs. The main target audience is the LSST Data Management (DM) team, but members of the community who are preparing white papers on science-driven observing strategies may also find this document useful. The potential diversity of data from Special Programs is summarized, including boundaries imposed by technical limitations of the LSST hardware. The capability of the planned Data Management system to process this diversity of Special Programs data is the main focus of this document. Case studies are provided as examples of how the LSST software and/or user-generated pipelines may be combined to process the data from Special Programs.
We develop a model for the HSC optical PSF through analysis of out-of-focus "donut" images.
We use version 13.0.9 of the LSST DM stack to reduce optical R band data taken with the KPNO 4-meter telescope for the Deep Lens Survey (DLS). Because this data set achieves an LSST-like depth and has been studied and characterized exhaustively over the past decade, it provides an ideal setting to evaluate the performance of the LSST DM stack. In this report we examine registration, WCS fitting, and image co-addition of DLS data with the LSST DM stack. Aside from using a customized Instrument Signature Removal package, we are successful in using the DM stack to process imaging data of a 40 x 40 square arcminute subset of the DLS data, ultimately creating a coadd image. We find the astrometric solutions on individual chips have typical errors <15 milliarcseconds, demonstrating the effectiveness of the Jointcal package. Indeed, our findings in this regard on the DLS data are consistent with similar investigations on HSC and CFHT data.
A closer look at the astrometry data set shows it contains larger errors in Right Ascension than Declination. Further examination indicates these errors are likely due to a guider problem with the telescope, and not the result of proper motions of stars, or a problem with the DM stack itself.
Finally, we produce a coadd using the reduced data. Our coadd is approximately 40 square arcminutes, much larger than the coadds typically created with the stack. Creating a large image stretched our machines to their limits, and we believe a dearth of system resources led to the coadd creation Task not finishing. In spite of this, the coadd produced by the stack is of comparable quality to its counterpart produced by the DLS team in previous analysis, both in depth and in the ability to remove artifacts that do not correspond to true astrophysical objects. However, issues were encountered with SafeClip.
A look at using OpenShift in LSST
State of image subtraction in the LSST stack
Initial installation, configuration, testing of Rucio server, RSEs and clients.
Try running ci_HSC pipeline using DESDM Framework
Design document for the DM Header Service.
Design sketches to support a feasibility study for using lsst.verify to instrument Tasks.
For now, a WIP scratch pad for new Butler APIs.
Report on the SuperTask design as emerging from the 2017 SuperTask Working Group activities.
Rules adopted by the AP pipeline to organize verification code throughout the Stack.
In this note we present some aspects of the observed I/O behavior of the command line tasks ingestImages.py and processCcd.py when used for processing HSC data and the issues the current implementation may raise for processing data at the scale needed for LSST.
Report on the delivery, installation, and initial use of an LSST Camera data acquisition (DAQ) test stand at NCSA, July 18-20, 2017. Includes notes from a discussion of future plans for DAQ work that was held following the installation.
This is a descriptive and explanatory document, not a normative document. This document explains the proposed baseline as presented in the DM replan in July 2017, referred to simply as the “baseline” in the prose that follows.
How the Engineering and Facilities Database will be transformed, maintained, and used within DM
The purpose of this document is to begin to assemble the diversity of motivations driving the inclusion of photometric redshifts in the LSST Level 2 Object Catalog, and prepare to make a decision on what kind of photo-z products will be used. The roadmap for this process is described in Section [sec:intro]. We consider the photo-z use-cases in order to validate that the type of photo-z incorporated into the Level 2 DRP catalog, and the format in which it is stored, meets the needs of both DM and the community. We also compile potential evaluation methods for photo-z algorithms, and demonstrate these options by applying them to the photo-z results of two off-the-shelf photo-z estimators. The long-term plan is for this document to develop over time and eventually describe the decision-making process and the details of the selected algorithm(s) and products. Preliminary recommendations can be found in Section [sec:intro].
Tests and experiments performed during the Qserv prototyping phase.
Tests performed with InfiniDB in late 2010. Testing involved executing the most complex queries, such as near neighbor, on the 1 billion row USNOB catalog.
Technologies that were investigated whilst determining the best approach for LSST
Summary of state of the art PSF estimation tools and their suitability for the LSST alert pipeline.
This attempts to summarise the debate around, and suggest a path forward, for LSST software releases. Although some recommendations are made, they are intended to serve as the basis of discussion, rather than as a complete solution.
This material is based on discussions with several team members over a considerable period. Errors are to be expected; apologies are extended; corrections are welcome.
Describes the new design for afw::math::Statistics
This document describes in detail the process of creating workflows for Batch Production Services.
Design of the LSST Camera Geometry (“camGeom”) system
A closer look at Pegasus WMS
This note describes work done for DM-7295. It includes instructions for using the LSST Stack to process a set of raw DECam images from ISR through Difference Imaging.
Most LSST objects will overlap one or more of their neighbors enough to affect naive measurements of their properties. One of the major challenges in the deep processing pipeline will be measuring these sources in a way that corrects for and/or characterizes the effect of these blends.
A description of the algorithm to calculate a model of the true sky and forward model atmospheric effects to create matched templates.
A description of the jointcal algorithm, for performing simultaneous astrometry and photometry for thousands of exposures with large CCD mosaics.
Winter 2013 LSST DM Data Challenge Release Notes
Summer 2012 LSST DM Data Challenge
Notes regarding Kubernetes and comparison with other container managers
Qserv data placement and replication strategies
The current reference catalog matcher used by LSST for astrometry has been found to be insufficiently robust and fails to find matches on several current datasets. This document describes a potential replacement algorithm, and compares its performance with the current implementation.
A design discussion and implementation plan for the pipelines.lsst.io documentation project, including information design and topic templates.
The document explains a procedure for loading the SDSS/Stripe82 catalogs into PDAC Qserv. The catalogs were produced in the course of the LSST Summer 2013 Data Challenge effort.
LSST alert distribution
Some steps to help you rename an LSST git repository with minimal disruption to others' builds.
This document describes how to wrap an LSST package with pybind11. It does this by following the process of wrapping a single header file in afw step-by-step.
A preliminary evaluation of selected workflow management systems
This document describes coding guidelines for using pybind11 within LSST.
A brief tutorial on using the main LSST command-line processing tasks.
Propose track to improve container infrastructure for Qserv
A short description of this document
This is the DM guide for T/CAMs implementing the earned value system.
I use the StarFast simulator to generate many simulated observations of a field at a range of airmasses from 1.0 to 2.0, and at several LSST bands. After differencing each image from the observation in each band closest to zenith, I generate a metric to characterize the number and size of dipoles in the residual.
Summary of the current understanding of L1 database design issues.
A high-level description of SQuaRE's delivery plan
A glossary of different kinds of coadded images, with brief descriptions of the algorithms behind them.
This document describes how C++ classes can be wrapped using pybind11.
This document describes how C++ classes can be wrapped using Cython.
A simple but fast and accurate simulation tool to generate large images with many realistic stars for testing algorithms.
Writeup of work done in Summer 2015 Winter 2016 to test for shear bias in measurements done using CModel and ShapeletPsfApprox from the DM stack. Tests were done on galaxies of known shape in the style of great3sims using constant shear. The PSFs applied were produced by PhoSim.
The future World Coordinate System and distortion model requirements of the LSST software stack, and a summary of currently available options from the community.
Testing full scan performance for vertical-partition joins of multiple tables
An analysis of the dipole measurement task as currently applied in the LSST image differencing pipeline.
An analysis of the false positives in DECam imaging data processed by the LSST pipeline.
This document describes our current World Coordinate System usage and implementation, in preparation for either a significant refactor or complete reimplementation using another public library.
Enabling debug options for developers inside docker containers
Notes about how version 1.0 of the AP simulator works, and its components
Notes on how to get started with SuperTask, Activator and Workflows
Dealing with System Integrity Protection on OS X El Capitan
A description of the DMTN series.
This technote illustrates how to use the Honeycomb APIs to send events, create Markers and configure alerts getting data from the SQuaSH API.
The purpose of this technote is to describe the software components that will produce the Data Quality report for the Prompt Processing Pipeline.
This is a short technote to describe the various features of the notebook aspect (hereafter JupyterLab).
The purpose of this technote is to collect best practices on using the Holoviews/Bokeh/Datashader stack for creating data visualizations across LSST.
The notebook-based test report system provides a way for LSST to generate and publish data-driven reports with an automated system. This technote describes the technical design behind the notebook-based report system.
The purpose of this technote is to demonstrate how the Bokeh Models API can be used to create new charts; in particular, we are interested in metric visualization for the SQuaSH monitoring dashboard.
The JupyterLab environment is becoming a powerful tool for all sorts of tasks that LSST team members commonly undertake. Data exploration and analysis are obvious cases where the distributed notebook environment is useful. This notebook will show how to use the notebook environment in conjunction with other aspects of JupyterLab: shell access and git authentication, to produce a meaningful development workflow.
This technote explores how JSON-LD (Linked Data) can be used to describe a variety of LSST project artifacts, including source code and documents. We provide specific examples using standard vocabularies (http://schema.org and CodeMeta) and explore whether custom terms are needed to support LSST use cases.
This technote describes, in a tutorial style, the lsst.verify API. This Verification Framework enables the LSST organization to define performance metrics, measure those metrics in Pipeline code, export metrics to a monitoring dashboard, and test performance against specifications.
A frequently updated note on SQuaRE’s JupyterLab prototyping
This technote describes the design of the metric and specification system that the new validation framework will use.
SQuaRE instructions for making official releases
A guide to writing microservices that will live behind api.lsst.codes and are intended for automated consumption
Instructions for porting LSST DM software to support both Python 2 and 3
Research and design of a documentation metadata database and API for LSST based on JSON-LD metadata.
Instructions on migrating LSST unittest tests to py.test
Overview of LSST DM's communication and documentation platforms, including: chat, ticketing, forums, technical notes and software documentation.
Overview of SQuaRE services
The SQuaSH dashboard is used to monitor the KPM metrics computed by the LSST verification framework; here we describe its design and implementation details.
SQuaSH QA database design
SQuaRE's Logging, monitoring and metrics system
LSST the Docs is a platform for publishing documentation websites. Through a microservice architecture, LSST the Docs is capable of building even complex multi-repository documentation projects and publishing many versions concurrently on the web.
This is a SQuaRE Technical Note describing the implementation of the LSST Publication Board process in JIRA.
User and operator documentation for the Conference & Meeting Proceedings of the American Astronomical Society
Prototype binary science pipeline software distribution methods.
SQR-000: The LSST DM Technical Note Publishing Platform
Evaluating the SRD survey strategy metrics
Minimoon detections in LSST: MAF and MOPS investigation
Checking if using calibration stars from GAIA is adequate to replace an ubercal procedure.
Provide information about interface requirements for writing a Scheduler that will interface with SOCS.
Description of the data and software used to generate the table of bright stars in the CatSim database.
Using the all-sky camera, see how often it is cloudy and whether the current all-sky camera is capable of building transparency maps.
Calculating trailing losses for moving objects.
This technote documents the various m5 calculation tools and explains the traceability of the underlying data/throughput curves.
Creating moving object inputs for simulations.
Reserve space for end-to-end overview of sims