The full solution for Smart EDGAR is provided as a Docker image.
The REST service provides the content of 10-K and 10-Q EDGAR XBRL filings. Below you can find the complete docker-compose.yml file for the solution. Just start the application with docker-compose up.
The first time you start the application, it runs a complete initial load. Because of the size of the data, this takes a very long time. Once the initial load has completed, only the latest changes are downloaded every 60 minutes (see the timer setting).
version: '3.0'
services:
  edgar-db:
    image: postgres:alpine
    container_name: db-edgar
    restart: always
    environment:
      - TZ=Europe/Zurich
      - POSTGRES_USER=edgar
      - POSTGRES_PASSWORD=tbd
    volumes:
      - /data/SmartEdgar/postgresql/data:/var/lib/postgresql/data
    ports:
      - 5432:5432
  edgar-service:
    image: pschatzmann/smart-edgar
    container_name: edgar-db
    environment:
      - xmx=2000m
      - TZ=Europe/Zurich
      - jdbcDriver=org.postgresql.Driver
      - jdbcURL=jdbc:postgresql://edgar-db:5432/edgar
      - jdbcUser=edgar
      - jdbcPassword=tbd
      - destinationFolder=/usr/local/bin/SmartEdgar/data
    links:
      - edgar-db
    volumes:
      - /data/SmartEdgar:/usr/local/bin/SmartEdgar/data/
    ports:
      - "9997:9997"
  edgar-load:
    image: pschatzmann/smart-edgar
    environment:
      - xmx=500m
      - TZ=Europe/Zurich
      - formsRegex=10-K.*|10-Q.*
      - timer=60
      - history=false
      - jdbcDriver=org.postgresql.Driver
      - jdbcURL=jdbc:postgresql://edgar-db:5432/edgar
      - jdbcUser=edgar
      - jdbcPassword=tbd
    links:
      - edgar-db
    command:
      - ./start.sh
      - ch.pschatzmann.edgar.dataload.DownloadProcessorJDBC
    volumes:
      - /data/SmartEdgar:/usr/local/bin/SmartEdgar/data/
Open http://localhost:9997 in your web browser to access the REST functionality. This takes you to the Swagger UI, which you can use to try out the web services. Replace localhost with the hostname if you want to access the solution from a different machine.
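If you prefer to check from a script whether the service is up, the following is a minimal sketch. It only assumes the port 9997 from the compose file above; the helper names `service_url` and `is_reachable` are made up for this illustration and are not part of Smart EDGAR.

```python
import urllib.error
import urllib.request


def service_url(host="localhost", port=9997):
    """Build the base URL of the REST service for a given host."""
    return f"http://{host}:{port}"


def is_reachable(url, timeout=5.0):
    """Return True if an HTTP server answers at url, False on network errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except OSError:
        # Connection refused, unknown host, timeout, and similar.
        return False
```

Calling `is_reachable(service_url())` checks the local machine; pass the hostname, for example `service_url("my-server")`, to check a remote one.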
Initial Setup (XBRL Download)
We provide functionality to download the latest relevant XBRL files automatically. Each filing is stored in a zip file, regardless of whether EDGAR provides zip files (new filings) or individual xml and xsl files (old filings). This section has the information you need if you do not want to rely on the default logic or if you want to force a reload.
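To illustrate the idea of storing each filing as one zip file even when only individual files are available, here is a small sketch. The function name `pack_filing` and the file names in the usage note are assumptions for this example; the actual download processors may organize the archives differently.

```python
import zipfile
from pathlib import Path


def pack_filing(source_files, zip_path):
    """Write the given files into one zip archive and return its path."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for source in source_files:
            # Store each file under its bare name, without directories.
            archive.write(source, arcname=Path(source).name)
    return zip_path
```

For example, `pack_filing(["report.xml", "report.xsl"], "filing.zip")` would produce a single archive with both files.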
We support the following data load scenarios:
- download the XBRL files and load them into a Postgres database (DownloadProcessorJDBC)
- download the XBRL files only (DownloadProcessorXbrlFile)
Environment Variables
We recommend downloading all information once and subsequently retrieving only the changes. This can be achieved with the following environment variables:
- history
  - true: determine all available filings in EDGAR
  - false: determine only the latest EDGAR filings
  - <empty value>: the system uses false only if a complete load into the database has finished, and true otherwise
- timer
  - time interval in minutes after which the download is repeated
  - if the value is empty, the functionality is executed only once
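The behaviour of these two variables can be sketched in a few lines. This is an illustration of the rules described above, not the actual implementation; the function names and the `max_runs` parameter are invented for the example.

```python
import time


def resolve_history(history_value, initial_load_completed):
    """Decide whether to load the full history from EDGAR."""
    if history_value is None or history_value.strip() == "":
        # Empty value: load the history until a complete load has finished.
        return not initial_load_completed
    return history_value.strip().lower() == "true"


def resolve_timer_minutes(timer_value):
    """Return the repeat interval in minutes, or None to run only once."""
    if timer_value is None or timer_value.strip() == "":
        return None
    return int(timer_value)


def run_repeatedly(load_once, timer_value, max_runs=None, sleep=time.sleep):
    """Call load_once, then repeat after each interval; return the run count."""
    interval = resolve_timer_minutes(timer_value)
    runs = 0
    while True:
        load_once()
        runs += 1
        if interval is None or (max_runs is not None and runs >= max_runs):
            return runs
        sleep(interval * 60)  # the timer is given in minutes
```

With `timer=60` the load would repeat every hour, and with an empty timer `run_repeatedly` returns after a single run.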
Delta Logic
- We load a filing from EDGAR only if it does not exist in our file system
- We load a filing into the Database only if it has not been loaded yet
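The two delta rules amount to simple set differences. The sketch below assumes, purely for illustration, that each filing is stored as `<filing id>.zip` in the data directory and that the ids already loaded into the database are available as a set; the real naming scheme and bookkeeping may differ.

```python
from pathlib import Path


def filings_to_download(available_ids, data_dir):
    """Return the filing ids that have no zip file in data_dir yet."""
    existing = {path.stem for path in Path(data_dir).glob("*.zip")}
    return [fid for fid in available_ids if fid not in existing]


def filings_to_load(downloaded_ids, loaded_ids):
    """Return the downloaded filing ids that are not in the database yet."""
    return [fid for fid in downloaded_ids if fid not in loaded_ids]
```

Because both checks skip what already exists, restarting the load after an interruption only picks up the missing filings.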
Download of Files into Database
Here is an example that forces a complete initial load of all files into the database.
version: '3.0'
services:
  edgar-db:
    image: postgres:alpine
    container_name: db-edgar
    restart: always
    environment:
      - TZ=Europe/Zurich
      - POSTGRES_USER=edgar
      - POSTGRES_PASSWORD=tbd
    volumes:
      - /data/SmartEdgar/postgresql/data:/var/lib/postgresql/data
    ports:
      - 5432:5432
  edgar-load:
    image: pschatzmann/smart-edgar
    environment:
      - xmx=500m
      - TZ=Europe/Zurich
      - formsRegex=10-K.*|10-Q.*
      - timer=60
      - history=true
      - jdbcDriver=org.postgresql.Driver
      - jdbcURL=jdbc:postgresql://edgar-db:5432/edgar
      - jdbcUser=edgar
      - jdbcPassword=tbd
    links:
      - edgar-db
    command:
      - ./start.sh
      - ch.pschatzmann.edgar.dataload.DownloadProcessorJDBC
    volumes:
      - /data/SmartEdgar:/usr/local/bin/SmartEdgar/data/
Download of Files (only)
Here is an example for the initial load of all XBRL zip files without loading them into the database.
version: '3.0'
services:
  edgar-files:
    image: pschatzmann/smart-edgar
    environment:
      - xmx=500m
      - TZ=Europe/Zurich
      - formsRegex=10-K.*|10-Q.*
      - history=true
    command:
      - ./start.sh
      - ch.pschatzmann.edgar.dataload.DownloadProcessorXbrlFile
    volumes:
      - /data/SmartEdgar:/usr/local/bin/SmartEdgar/data/
Docker Environment Variables
Here is the list of all supported environment variables
Environment Variable | Default Value | Description |
---|---|---|
destinationFolder | /usr/local/bin/SmartEdgar/data/ | Data directory which is used to store and access the xbrl zip files |
timer | | number of minutes to wait before repeating the data load |
history | true if the initial load has never completed; false if the initial load has completed | Load historic data from EDGAR. Set the required value to override the default logic |
formsRegex | 10-Q.*|10-K.* | Regex which selects the forms to be loaded |
jdbcDriver | org.postgresql.Driver | Postgres jdbc driver |
jdbcUser | | userid to access the database |
jdbcPassword | | password to access the database |
jdbcURL | jdbc:postgresql://nuc.local:5432/edgar | jdbc url to access the database |
typeString | VARCHAR(1000) | default sql datatype for strings |
typeNumber | DECIMAL(20,2) | default sql datatype for numbers |
typeDate | DATE | default sql datatype for dates |
minPeriod | 2005-04 | Starting period for data load |
xmx | 3000m | xmx java memory setting |
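To see which form types a formsRegex value selects, you can try the expression against a few form names. The sketch below uses Python and assumes that the whole form name must match the expression; the service itself runs on Java, so treat this as an approximation for simple patterns like the default one.

```python
import re

FORMS_REGEX = "10-K.*|10-Q.*"  # value used in the compose files above


def is_selected(form_type, pattern=FORMS_REGEX):
    """Return True if the whole form type matches the pattern."""
    return re.fullmatch(pattern, form_type) is not None


if __name__ == "__main__":
    for form in ["10-K", "10-K/A", "10-Q", "8-K"]:
        print(form, is_selected(form))
```

Under this assumption, amended filings such as 10-K/A are included because of the trailing `.*`, while unrelated forms such as 8-K are skipped.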
Further Information
Further information can be found in my posts
2 Comments
Daniel · 16. January 2019 at 5:56
Phil, Thanks for sharing your project. When I run Smart Docker Image using the YAML file at the top of this post, I get the following error. Any ideas what’s going wrong?
ERROR: Service ‘edgar-service’ has a link to service ‘smart-edgar-db’ which is undefined.
Thanks!
pschatzmann · 16. January 2019 at 7:56
Hello,
I am not sure what exactly the issue is. There were issues with the indentation, and the user needs to be set to edgar.
I have updated the document and confirmed that a copy-pasted version of docker-compose.yml is working now.
I recommend changing the password and setting the volume mappings to directories that make sense for you.
Please let me know if you still have issues.
Kind regards
Phil