Open Source in Real Life

Ana Schwendler

September 24, 2019

Transcript

  1. WHAT IS SERENATA?
    • The main goal: use artificial intelligence for social control of public administration
    • We learnt how to work with data science using open data (CSVs that show reimbursements).
    • Multidisciplinary team: scientists, programmers, marketers and journalists
    • Open Source: more than 700 members in the Telegram group.
  2. WHY?
    • Advantages: bringing citizens and government closer, suggesting technology solutions
    • For the developer: tool choice flexibility
  3. HOW DO WE GET HERE?
    • We ran a crowdfunding campaign that would pay for 3 months of development
    • Data science projects usually take 6 months to a year. What can we do in 3 months?
    • Techniques: hypothesis-driven development and timeboxing
  4. HDD: HYPOTHESES
    • Hypothesis-Driven Development
    • Survey of hypotheses that seek the solution to a problem
    • Multidisciplinary team as a way to expand knowledge
  5. TIMEBOXING
    • List of hypotheses to explore
    • Associate a time window with development, and if it doesn't work, switch to another hypothesis
    • Go back to previous assumptions as time goes by
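
The timeboxing loop described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual tooling: the `Hypothesis` type, the `run_timeboxed` function, and the idea of measuring a time box in "work steps" rather than calendar weeks are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

# Hypothetical representation: a hypothesis is a name plus a step function
# that does one unit of work and returns True once the hypothesis pans out.
Hypothesis = Tuple[str, Callable[[], bool]]


def run_timeboxed(hypotheses: List[Hypothesis], box_size: int, max_boxes: int) -> List[str]:
    """Spend at most `max_boxes` time boxes of `box_size` steps each.

    A hypothesis that does not work out within its box goes to the back of
    the queue, so it is revisited later if time remains.
    """
    queue: Deque[Hypothesis] = deque(hypotheses)
    confirmed: List[str] = []
    for _ in range(max_boxes):
        if not queue:
            break
        name, step = queue.popleft()
        for _ in range(box_size):
            if step():
                confirmed.append(name)
                break
        else:
            # Time box expired without a result: switch to another hypothesis.
            queue.append((name, step))
    return confirmed
```

Counting steps instead of wall-clock time keeps the sketch deterministic; a real project would more likely count days or sprints.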
  6. DEVELOPED HYPOTHESES
    • We studied the available dataset and, from that, defined some hypotheses to explore:
      ◦ Non-standard prices on food
      ◦ Traveled distance and spending
      ◦ Invalid tax identification number
      ◦ Monthly maximums (taxi, fuel, ...)
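
As an illustration of one of these hypotheses, the "monthly maximums" check could look roughly like the sketch below. The record layout, the `over_monthly_maximum` name, and the cap values passed in by the caller are assumptions made for the example; the slides do not describe the real column names or limits.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (person, "YYYY-MM", category, amount).
Reimbursement = Tuple[str, str, str, float]


def over_monthly_maximum(
    rows: Iterable[Reimbursement], caps: Dict[str, float]
) -> List[Tuple[str, str, str, float]]:
    """Return (person, month, category, total) for totals above the category cap."""
    totals: Dict[Tuple[str, str, str], float] = defaultdict(float)
    for person, month, category, amount in rows:
        totals[(person, month, category)] += amount
    # Categories without a cap in `caps` are ignored rather than flagged.
    return sorted(
        (person, month, category, total)
        for (person, month, category), total in totals.items()
        if category in caps and total > caps[category]
    )
```

Summing per person, month, and category before comparing means a cap is caught even when no single receipt exceeds it.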
  7. DEVELOPMENT CYCLE
    • Jupyter notebook with initial analysis
    • Script for parsing the entire database
    • Training an initial model
    • Retraining after a time period
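
The "train, then retrain after a time period" part of this cycle can be illustrated with a deliberately simple stand-in model. The slides do not say which model the project used; the mean-plus-k-standard-deviations threshold below, and the names `PriceOutlierModel` and `retrain`, are assumptions chosen only to keep the sketch small.

```python
from statistics import mean, pstdev
from typing import List


class PriceOutlierModel:
    """Toy stand-in model: flags amounts above mean + k * standard deviation."""

    def __init__(self, k: float = 3.0) -> None:
        self.k = k
        self.threshold = float("inf")  # nothing is flagged before training

    def fit(self, amounts: List[float]) -> "PriceOutlierModel":
        self.threshold = mean(amounts) + self.k * pstdev(amounts)
        return self

    def is_suspicious(self, amount: float) -> bool:
        return amount > self.threshold


def retrain(model: PriceOutlierModel, history: List[float], new_batch: List[float]) -> PriceOutlierModel:
    """Append the data gathered since the last run and refit on everything."""
    history.extend(new_batch)
    return model.fit(history)
```

Refitting on the full history is the simplest retraining policy; a sliding window over recent data would be an equally plausible choice.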