How to get specific fields: fl parameter
To make a query more efficient and retrieve only the data you need, use the fl (field list) parameter. It limits the information included in a query response to a specified list of fields.
Information about available IMPC fields for each core is available here.
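To see what fl does at the level of the HTTP request, the sketch below builds the URL that such a query roughly corresponds to. It only constructs the URL and sends nothing. The base URL and the helper name build_select_url are assumptions made for illustration, not part of the IMPC Python module.

```python
from urllib.parse import urlencode

# Assumed base URL of the public IMPC Solr service; check the IMPC documentation.
BASE_URL = "https://www.ebi.ac.uk/mi/impc/solr"


def build_select_url(core, params):
    """Return a Solr select URL for a core and a dict of Solr parameters.

    Hypothetical helper for illustration only.
    """
    # Keep '*', ':' and ',' unescaped so the query stays readable.
    query = urlencode(params, safe="*:,")
    return f"{BASE_URL}/{core}/select?{query}"


url = build_select_url(
    "statistical-result",
    {"q": "*:*", "fl": "marker_symbol,effect_size", "rows": 3},
)
print(url)
```

The fl value is a single comma-separated string, so every requested field travels in one parameter of the request.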
Warning! If you misspell a field name, it is silently dropped from the output. To catch this, turn on the validate flag, which returns a warning when a requested field name is not in the list of available fields.
num_found, df = solr_request(
    core='statistical-result',
    params={
        'q': '*:*',
        'fl': 'marker_symbol,top_level_mp_term_name,effect_size,p-value',
        'rows': 3
    },
    validate=True
)
In the example above, the validate flag was set to True, and three documents from the statistical-result core were requested. Only four fields were specified:
- marker_symbol
- top_level_mp_term_name
- effect_size
- p-value
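The kind of check a validation step performs can be approximated locally before sending a request. The sketch below is not the module's implementation: the helper find_unknown_fields is hypothetical, and KNOWN_FIELDS is an assumed, partial list that should be replaced with the documented field list for the core you query.

```python
from difflib import get_close_matches

# Assumed, partial list of field names for illustration only;
# use the documented field list of the core in practice.
KNOWN_FIELDS = [
    "marker_symbol",
    "top_level_mp_term_name",
    "effect_size",
    "p_value",
]


def find_unknown_fields(fl, known_fields):
    """Map each name in a comma-separated fl string that is not in
    known_fields to a list of close matches (possibly empty)."""
    unknown = {}
    for name in fl.split(","):
        name = name.strip()
        if name and name not in known_fields:
            # Suggest similarly spelled known fields, if any.
            unknown[name] = get_close_matches(name, known_fields, n=1)
    return unknown


problems = find_unknown_fields(
    "marker_symbol,top_level_mp_term_name,effect_size,p-value",
    KNOWN_FIELDS,
)
for name, suggestions in problems.items():
    hint = f" (did you mean {suggestions[0]}?)" if suggestions else ""
    print(f"Unknown field: {name}{hint}")
```

Checking names up front makes a dropped column visible before the data is downloaded, rather than after a result comes back with fewer fields than expected.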