PyCon JP 2025
International Conference Center Hiroshima (広島国際会議場)
Nicholas Dwiarto

Pythonic Finance: Analyze Company Fundamentals with SEC EDGAR APIs

ダリア2 (Dahlia 2) / EN
02:15 - 02:45 (30 min)
DAY 1, 09/26 (FRI)

In this talk, Python and quantitative methods are used to access, validate, and analyze fundamental financial data from the US Securities and Exchange Commission (SEC) EDGAR system. The SEC's JSON API provides structured financial data, derived from company filings reported in eXtensible Business Reporting Language (XBRL), an international standard for financial reporting. Pydantic is used for robust data validation. Attendees will learn to:

  • Fetch basic metrics
  • Calculate financial ratios
  • Visualize trends
  • Navigate common data challenges

While the talk focuses on US market data, it also gives a brief overview of the international landscape. No finance background is required. Basic Python is required to follow the data processing part.
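As a rough illustration of what accessing the companyfacts data can look like, here is a minimal sketch. It uses `urllib` from the standard library to stay dependency-free, while the talk itself plans to use `requests`. The helper names (`companyfacts_url`, `fetch_companyfacts`) are hypothetical, and the sketch assumes the endpoint expects a CIK zero-padded to 10 digits and a descriptive `User-Agent` header.

```python
import json
import urllib.request

# Base path of the companyfacts endpoint on data.sec.gov.
BASE_URL = "https://data.sec.gov/api/xbrl/companyfacts"


def companyfacts_url(cik: int) -> str:
    """Build the companyfacts URL; the CIK is zero-padded to 10 digits."""
    return f"{BASE_URL}/CIK{cik:010d}.json"


def fetch_companyfacts(cik: int, user_agent: str, timeout: float = 10.0) -> dict:
    """Download and decode the companyfacts JSON for one company.

    `user_agent` should identify the caller (for example a name and a
    contact email); requests without one may be rejected.
    """
    request = urllib.request.Request(
        companyfacts_url(cik),
        headers={"User-Agent": user_agent},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# 1234 is an arbitrary placeholder CIK, used only to show the URL shape.
print(companyfacts_url(1234))
```

Calling `fetch_companyfacts` needs network access, so the sketch only prints the URL it would request.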


Description

Target

  • Who: This talk is designed for anyone interested in using Python to understand public company financials, from students, programmers, and hobbyists to experienced working professionals. No finance background is required. Beginner-level Python (variables, functions, lists, using libraries) may be needed to follow how the financial data is processed into metrics.

  • What: Attendees will get a brief introduction to XBRL and hands-on techniques for retrieving data from SEC EDGAR's companyfacts API. Combining data science and software engineering practices, they will validate data with Pydantic, extract key fundamental metrics from the API responses, calculate basic financial ratios, visualize financial trends, and learn the nuances of working with public financial data APIs, including those outside the US.

  • How: The talk mixes core financial concepts with live Python demonstrations in a Jupyter Notebook / Google Colaboratory (a fallback is prepared in case of network errors). It walks through the entire process: selecting a company whose financial health will be visualized, ensuring that the data is valid, and calculating financial metrics and ratios.
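To give a feel for the validation step, here is a minimal sketch of Pydantic models for a companyfacts-style payload. The model names, the helper `parse_company_facts`, and the chosen fields are assumptions based on the general shape of the API response, not the models used in the talk. The sample payload is invented for a fictional company.

```python
from typing import Dict, List, Optional

from pydantic import BaseModel, ValidationError


class Fact(BaseModel):
    """One reported value for a concept (field selection is an assumption)."""
    end: str                    # period end date, e.g. "2023-12-31"
    val: float                  # reported value
    fy: Optional[int] = None    # fiscal year
    fp: Optional[str] = None    # fiscal period, e.g. "FY"
    form: Optional[str] = None  # filing type, e.g. "10-K"


class Concept(BaseModel):
    label: Optional[str] = None
    units: Dict[str, List[Fact]]  # keyed by unit, e.g. "USD"


class CompanyFacts(BaseModel):
    cik: int
    entityName: str
    facts: Dict[str, Dict[str, Concept]]  # taxonomy -> tag -> concept


def parse_company_facts(payload: dict) -> Optional[CompanyFacts]:
    """Return the validated model, or None if the payload is malformed."""
    try:
        return CompanyFacts(**payload)
    except ValidationError as exc:
        print(f"Payload failed validation: {len(exc.errors())} error(s)")
        return None


# Invented payload for a fictional company; the numbers are not real data.
sample = {
    "cik": 1234,
    "entityName": "Example Corp",
    "facts": {
        "us-gaap": {
            "Revenues": {
                "label": "Revenues",
                "units": {
                    "USD": [
                        {"end": "2023-12-31", "val": 1000, "fy": 2023,
                         "fp": "FY", "form": "10-K"}
                    ]
                },
            }
        }
    },
}

company = parse_company_facts(sample)
print(company.entityName if company else "invalid payload")
```

Returning `None` on failure keeps the demo flow simple; a stricter pipeline might prefer to re-raise the error so that bad data cannot pass silently.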

Scope

  • I am NOT a financial advisor, and this talk is for educational purposes only. I will NOT promote or recommend any specific assets, companies, products, or strategies, and this talk is NOT to be construed as financial, investment, or trading advice
  • Advanced financial modeling, company valuations, and comprehensive ratio analysis are outside of the talk's scope
  • This talk does not go deep into parsing the XBRL format
  • This talk does not cover how to buy and/or sell securities
  • All examples in this talk use historical data; past performance is not indicative of future performance

Outline

Planned outline of the presentation:

  • Introduction and Disclaimer (~3 minutes)
    • Quick self-introduction and an important disclaimer that I am not a financial advisor and this talk is not financial advice
    • Why analyze company fundamentals with Python?
  • The Data: SEC EDGAR, XBRL, and APIs (~4 minutes)
    • Overview of the SEC, the EDGAR system, and U.S. company filings (10-K)
    • Brief introduction to XBRL as the structured data standard
    • Focus on the companyfacts JSON API for XBRL-derived data
  • Real-Life Scenario: Ensuring Data Quality with Pydantic (~3 minutes)
    • Why Pydantic, and a quick overview of the Pydantic models for the API response
  • Fetching and Validating Company Data (~3 minutes)
    • Use requests and pydantic to get and validate data for a sample U.S. company
    • Handle potential API or validation issues
  • Finance Metrics Explanation & Extraction (~7 minutes)
    • Explanation of basic, core finance metrics: Revenue, Net Income, Assets, Liabilities, Equity
    • Extraction of the annual data and building a pandas DataFrame to showcase it
    • Adapting to different XBRL tags (and therefore different JSON properties) for the same financial concept, since companies do not all use the same schema
  • Calculation of Finance Ratios (~3 minutes)
    • Explanation of finance ratios: Net Profit Margin, Debt to Equity Ratio
    • Calculation of the metrics
  • Visualizing Trends (~2 minutes)
    • Generating and showcasing plots of the metrics and ratios with matplotlib / seaborn
  • Internationalization and Data Nuances (~2 minutes)
    • How other countries run different systems (for example, Japan has EDINET) built on the same XBRL data structure, showing that the skills and knowledge are transferable
  • Key Takeaways, Recap, Closing (~2 minutes)
    • Summary of the process, the tools, suggestions for future exploration
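The extraction and ratio steps in the outline can be sketched roughly as follows. This is an illustration under several assumptions, not the talk's implementation: plain dictionaries stand in for the pandas DataFrame, the candidate tag lists are examples of us-gaap tags a company might use, Debt to Equity is taken as total liabilities divided by shareholders' equity (one common definition), and the sample figures are invented. Real filings also repeat prior-year comparatives and may contain several periods ending on the same date, so production code needs more careful deduplication than keying on the period end date.

```python
from typing import Dict, List, Optional, Sequence

# Candidate tags per concept, tried in order (an illustrative, partial list).
REVENUE_TAGS = (
    "Revenues",
    "RevenueFromContractWithCustomerExcludingAssessedTax",
    "SalesRevenueNet",
)
NET_INCOME_TAGS = ("NetIncomeLoss",)
LIABILITIES_TAGS = ("Liabilities",)
EQUITY_TAGS = ("StockholdersEquity",)


def annual_values(us_gaap: dict, tags: Sequence[str], unit: str = "USD") -> Dict[str, float]:
    """Return {period_end: value} from the first tag that has annual 10-K facts."""
    for tag in tags:
        facts = us_gaap.get(tag, {}).get("units", {}).get(unit, [])
        by_end = {
            fact["end"]: fact["val"]
            for fact in facts
            if fact.get("form") == "10-K" and fact.get("fp") == "FY"
        }
        if by_end:
            return by_end
    return {}


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Divide, returning None when a value is missing or the denominator is zero."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


def build_ratio_table(us_gaap: dict) -> List[dict]:
    """Compute Net Profit Margin and Debt to Equity for each annual period."""
    revenue = annual_values(us_gaap, REVENUE_TAGS)
    net_income = annual_values(us_gaap, NET_INCOME_TAGS)
    liabilities = annual_values(us_gaap, LIABILITIES_TAGS)
    equity = annual_values(us_gaap, EQUITY_TAGS)
    return [
        {
            "end": end,
            "net_profit_margin": safe_ratio(net_income.get(end), revenue[end]),
            "debt_to_equity": safe_ratio(liabilities.get(end), equity.get(end)),
        }
        for end in sorted(revenue)
    ]


def annual_fact(end: str, val: float) -> dict:
    """Hypothetical helper that builds one annual 10-K fact for the sample."""
    return {"end": end, "val": val, "fp": "FY", "form": "10-K"}


# Invented figures; this fictional company reports revenue under SalesRevenueNet.
sample_us_gaap = {
    "SalesRevenueNet": {"units": {"USD": [annual_fact("2023-12-31", 1000)]}},
    "NetIncomeLoss": {"units": {"USD": [annual_fact("2023-12-31", 150)]}},
    "Liabilities": {"units": {"USD": [annual_fact("2023-12-31", 400)]}},
    "StockholdersEquity": {"units": {"USD": [annual_fact("2023-12-31", 800)]}},
}

ratio_table = build_ratio_table(sample_us_gaap)
print(ratio_table)
```

Trying tags in a fixed order keeps the fallback logic easy to inspect; the returned list of rows can be passed directly to `pandas.DataFrame` for plotting.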
Nicholas Dwiarto

Profile

Nicholas is a software engineer based in Japan. Originally from Indonesia, he spends his weekends exploring Tokyo's neighborhoods, hiking local mountains, and reading articles about tech, finance, or anything that sparks his curiosity. Nicholas is passionate about building things that make his life a bit easier, and he's always up for good views of the city.