Data Governance and Personal Data Protection Ac...

Fiqry Revadiansyah
April 01, 2025

Data Governance and Personal Data Protection Act at Paper.id

Webinar talk at Evermos Indonesia (2025-03-20) on implementing data governance and Indonesia's Personal Data Protection Act (UU No. 27/2022) at Paper.id

Topics: Data Engineering, Data Science, Artificial Intelligence, Analytics Engineer, Data Steward, Data Governance Staff

Transcript

  1. Speaker: Fiqry Revadiansyah, Div. Lead Data Science and Engineering

    A data professional with 6+ years of experience in Data Science, Analytics, and Engineering, currently leading AI and data innovation at Paper.id. He has cultivated a diverse skill set across sectors such as finance and consulting, integrating data insights and automation into product development, business strategies, and engineering solutions. Beyond his professional role, he has spoken at more than 20 national and international events and serves as an advisory board member for Statistics and Data Science at Universitas Padjadjaran.

    Webinar Series TGIF #1. Copyright@2025 Paper.id - Confidential & Proprietary
  2. What will we discuss during this Sharing Session

    1. UU PDP - Understanding the Framework
    2. Data Governance at Paper.id
    3. Paper.id Story on Data Governance Journey
  3. Our Core Capabilities, known as Trifecta

    Paper.id is the full-stack B2B payment platform that boosts SMEs' productivity, helps Suppliers get paid faster, and lets Buyers pay later. In B2B transactions, payment and financing are tied to invoicing, which acts as the basis. Full-stack invoicing and document exchange ensure that payment and financing are made at the right amount and on the right terms.

    - Invoicing: Send, exchange, and track all invoices and other business documents.
    - Payment: Easy sending and receiving of payments through commonly used payment methods, at a low cost and with a fast SLA (Service Level Agreement).
    - Financing: Control and access options to shorten payment terms towards buyers and lengthen payment terms from suppliers.
  4. Bridging the Transaction Between a Supplier and a Buyer

    Paper.id can be used by a supplier to receive payment from their buyer, or by a buyer to make payment to their supplier.

    - Supplier: make and send invoices; receive payment.
    - Buyer: receive invoices from suppliers; make payment.
    - Between both sides: document exchange, API integration, automated payment reconciliation.
    - Make Payment: Based on the invoice from the supplier, Paper.id reconciles the payment to the invoice once the payment has been made. The information is integrated directly into the accounting and stock management modules.
    - Receiving Payment: Enables a supplier to send invoices with e-meterai via email, WhatsApp, and SMS. In doing so, the seller offers multiple payment methods for paying the invoice. Payment reminders and automatic reconciliation are in place.
  5. Data Science and Engineering @ Paper.id

    “Trust in AI, all others must bring human”

    We build impactful data solutions by blending cutting-edge engineering, AI-driven innovation, and a team-first mindset, grounded in collaboration, ownership, and excellence.

    “Building Blocks of Data-Driven Intelligence”

    - Functionality: Data Scientist, AI Engineer, AI Product Manager
      - Vision: Transforming Paper.id into an AI-Native Company
      - Objective: Develop and deploy high-quality AI products across all business units at Paper.id to drive efficiency, innovation, and data-driven decision-making.
    - Functionality: Data Engineer, Analytics Engineer
      - Vision: Building the data foundation that powers Paper.id's intelligent decisions
      - Objective: Build and maintain high-quality, scalable, and reliable data solutions to support all business units at Paper.id in making data-driven decisions and driving innovation.
  6. UU PDP: Understanding the Framework

    Undang-Undang Perlindungan Data Pribadi (Personal Data Protection Act)
  7. UU Perlindungan Data Pribadi: Docs Summary

    Understanding our foundation of data analytics, science and engineering.

    - What: A personal data protection law that regulates the processing of information that can identify individuals directly or indirectly.
    - When: Enacted October 17, 2022, with a two-year implementation period ending October 2024.
    - Who: Applies to all organizations processing Indonesian citizens' data, both within and outside Indonesian territory.
    - Where: Covers data processing within Indonesia, and abroad if it affects Indonesian citizens or legal interests.
    - Why: Protects individual privacy rights while establishing a clear framework for responsible data handling.
    - How: Through mandatory security measures, data subject rights, breach notifications, and enforcement mechanisms.
  8. UU Perlindungan Data Pribadi: Data Subject Rights under UU PDP

    - Specific Sensitive Data: health records, biometric data, genetic data, criminal records, children’s data, financial information.
    - General Sensitive Data: name, gender, religion, marital status, contact information, other identity data.

    Users have the right to:

    - Receive information about data collection, purpose, and usage
    - Access personal data and obtain copies
    - Delete or destroy personal data
    - Update, correct, and complete inaccurate data
    - Data portability
  9. UU Perlindungan Data Pribadi: What do we need to cover for UU PDP Compliance as a Data Team

    - Establish a Data Governance Committee: Data Team, Legal Team, IT Security, Business Unit
    - Develop Data Governance Framework: Documentation Structure, Data Inventory and Classification
    - Develop the Technical Protection Guide: Implement Encryption, Establish RBAC, Set Up MFA, Set Up Audit Logging
    - Data Subject Rights Management: Request Handling Procedure, Consent Management
    - Develop Incident Response and Recovery: Incident Detection and Response, Disaster Recovery
    - Set Up Monitoring and Compliance Audit: Implement Regular Audits, Deploy an automated monitoring tool
    - Training and Awareness of UU PDP: Workshops for related teams, Documentation and Guidelines
  10. Data Governance at Paper.id

    How do we “achieve” the unpredictable stage with a minimal number of data team members?
  11. Paper.id Data Governance: Security Framework and Certification at Paper.id

    - ISO 27001: Standard for information security management systems (ISMS), managing sensitive company information including financial data, intellectual property, employee details, and information entrusted by third parties.
    - PCI DSS: Payment Card Industry Data Security Standard, a set of security standards designed to ensure that all companies that accept, process, store, or transmit credit card information maintain a secure environment.
  12. Paper.id Data Governance: How do we comply with UU PDP - Comprehensive Documentation Framework

    Pillars: Data Layer Security, ETL Pipeline Security, Compliance Monitoring, Technical Documentation.

    Controls:

    - Column-level encryption for PII
    - Row-level security in data access
    - Apply IAM roles for cloud access
    - Deploy VPC service control
    - Dashboard governance (PAM)
    - Use secret management for pipeline credentials
    - Implement data masking in ETL
    - Implement CI/CD flow checker
    - Create immutable audit logs
    - Configure data catalog with PII classification metadata
    - Implement retention policies at table/dataset level
    - Create automated discovery scanning for unclassified PII
    - Maintain data lineage graphs
    - Document encryption techniques
    - Disaster recovery for table/dataset
    - Maintain RBAC matrix mapping with PII access justification

    UU PDP references:

    - Article 5-15, 20-24 about data subject rights & consent management
    - Article 35-39 about security controls
    - Article 46 about breach notification
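
The data masking control listed on this slide can be illustrated with a minimal sketch. It is not Paper.id's implementation: the function names, the `email` column rule, and the choice of keyed SHA-256 (HMAC) for pseudonymization are all assumptions made for illustration, using only the Python standard library.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest; equal inputs give equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        return "***"
    return f"{local[0]}***@{domain}"


def mask_row(row: dict, pii_columns: set, key: bytes) -> dict:
    """Mask only the columns flagged as PII; pass everything else through."""
    masked = {}
    for column, value in row.items():
        if column not in pii_columns or value is None:
            masked[column] = value
        elif column == "email":  # hypothetical column-specific rule
            masked[column] = mask_email(value)
        else:
            masked[column] = pseudonymize(str(value), key)
    return masked


row = {"invoice_id": 101, "email": "budi@example.com", "phone": "0811111111"}
safe_row = mask_row(row, {"email", "phone"}, key=b"demo-key-not-for-production")
```

In this sketch `safe_row["invoice_id"]` is unchanged, the email keeps only its first character and domain, and the phone number becomes a 64-character hex token. A real pipeline would load the key from a secret manager rather than hard-coding it.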
  13. Paper.id Data Governance Implementation - Dashboard Governance - Data Architecture

    - Data Source: Database / API Source, ingested by Stream (Binlog/WAL) or Batch (Extract Load)
    - Layer 1, Data Lake: Raw Tabularized Table
    - Layer 2, Data Staging: Staging Table
    - Layer 3, Data Warehouse: Dim & Fact Table (**)
    - Layer 4, Data Mart: Data Mart/Report Table
    - Analytics Platform: Self Service Analytics (*)
    - AI Platform: AI & ML Development (*)

    Transform steps connect the layers.

    Notes:

    - (*) Any dashboard/AI/ML creation must only use the standardized data sources (L3 and above).
    - (**) Separate the PII layer from the data warehouse layer; the PII layer is completely hidden on Metabase/Looker Studio.
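
The rule that dashboards and AI/ML work may only read from L3 and above could be enforced with a small check like the one below. This is a sketch under assumptions: the `l1_` to `l4_` table-name prefix is a hypothetical naming convention, not one stated in the slides.

```python
import re

# Hypothetical convention: table names start with l1_, l2_, l3_ or l4_.
LAYER_PATTERN = re.compile(r"^l(\d)_")
MIN_DASHBOARD_LAYER = 3


def table_layer(table_name: str):
    """Return the layer number encoded in the table name, or None."""
    match = LAYER_PATTERN.match(table_name.lower())
    return int(match.group(1)) if match else None


def disallowed_sources(tables: list) -> list:
    """List tables a dashboard must not use (below L3 or unlabelled)."""
    return [t for t in tables if (table_layer(t) or 0) < MIN_DASHBOARD_LAYER]


dashboard_tables = ["l3_fact_invoice", "l4_mart_payment", "l1_raw_users"]
violations = disallowed_sources(dashboard_tables)
```

Here `violations` contains only `l1_raw_users`. Tables without a recognizable prefix are rejected as well, which is a deliberately strict choice for the sketch.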
  14. Paper.id Data Governance Implementation - Dashboard Governance - Access and Documentation

    1. Our Metabase exposes only the standardized data sources (L3 layer and above), which are designated as the single source of truth for creating dashboards (source: BigQuery).
    2. Every team has its own folder; nobody from another group can open another team’s folder or data. For further access, a Slack Workflow request must be submitted first.
    3. Every data source (L3 and above) is stored in the Paper Data Model as the only source for stakeholders' self-service analysis, and the data team has already completed the table metadata.
  15. Paper.id Data Governance Implementation - PAM (Privileged Access Management) Implementation - Workflow

    1. Start: access the BU folder.
    2. Don’t have access (folder is hidden): fill in the access form on Slack and wait for approval.
    3. Have access to the BU folder: find the specific data and open the dashboard / question.
    4. Information still incomplete:
       - Dashboard / question is simple: create it yourself.
       - Dashboard / question is too complex: ask a data team member (ticket request).
    5. Analyze the data.
    6. Fill in the dashboard management info; the dashboard / question is published.
    7. Finish.

    *VPN needs to be activated while operating PaperSPARK.
  16. Paper.id Data Governance Implementation - PAM (Privileged Access Management) Implementation - Documentation

    1. User fills in the Access Request on the Slack Workflow
    2. On-call member evaluates the request
    3. On-call member approves the request
    4. ClickUp / Jira ticket is automatically created
    5. On-call member grants the access
    6. On-call member asks for confirmation

    Agentic AI in dev
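
The access-request steps on this slide form a linear sequence, which can be modelled as a tiny state machine. The sketch below is illustrative only: the `Step` names are invented labels for the six steps, and no rejection path is modelled because the slide does not describe one.

```python
from enum import Enum


class Step(Enum):
    REQUEST_SUBMITTED = 1       # user fills in the Slack Workflow
    REQUEST_EVALUATED = 2       # on-call member evaluates
    REQUEST_APPROVED = 3        # on-call member approves
    TICKET_CREATED = 4          # ClickUp / Jira ticket created automatically
    ACCESS_GRANTED = 5          # on-call member grants access
    CONFIRMATION_REQUESTED = 6  # on-call member asks for confirmation


def next_step(current: Step) -> Step:
    """Advance exactly one step; the last step has no successor."""
    if current is Step.CONFIRMATION_REQUESTED:
        raise ValueError("workflow already finished")
    return Step(current.value + 1)


def run_workflow() -> list:
    """Walk the whole workflow and return the audit trail of step names."""
    trail = [Step.REQUEST_SUBMITTED]
    while trail[-1] is not Step.CONFIRMATION_REQUESTED:
        trail.append(next_step(trail[-1]))
    return [step.name for step in trail]


audit_trail = run_workflow()
```

Keeping the steps in an explicit enum makes it harder to skip the ticket-creation step, which is the part that leaves an audit record.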
  17. Paper.id Data Governance Implementation - PAM (Privileged Access Management) Implementation - Documentation

    - Metabase on VM
    - Store metadata in PostgreSQL
    - Use Python to populate it (Airflow for automation)
    - Store to the database, show in Metabase (limited)
  18. Paper.id Data Governance Implementation - PAM (Privileged Access Management) Implementation - Documentation

    Document every dashboard creation so that published data can be traced.
  19. Paper.id Data Governance Implementation - Data Request SOP

    1. External-party data requests always require proof of an NDA. NO NDA = NO DATA.
    2. Create the SOP (for both internal and external requests):
       a. Implement Data Encryption
       b. Access Control (Online Data)
       c. PII Data Masking
       d. Secure File Transfer (SFTP)
       e. File Protection (Password)
       f. Data Retention Plan
    3. Document everything for audit:
       a. List of data that has been requested, including the requester's access
       b. Set the PIC (in on-call cadence)
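
The “NO NDA = NO DATA” rule and the audit-listing step can be sketched as a simple gate. Everything here is hypothetical scaffolding (the `DataRequest` fields, `can_release`, and `audit_entry` are invented for illustration); the slides only state the rule itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRequest:
    requester: str
    dataset: str
    is_external: bool
    has_signed_nda: bool = False


def can_release(request: DataRequest) -> bool:
    """External parties need an NDA on file; internal requests do not."""
    return (not request.is_external) or request.has_signed_nda


def audit_entry(request: DataRequest, pic: str) -> dict:
    """Record what was asked, by whom, who handled it, and the decision."""
    return {
        "requester": request.requester,
        "dataset": request.dataset,
        "pic": pic,
        "released": can_release(request),
    }


vendor = DataRequest("vendor-a", "l4_mart_payment", is_external=True)
entry = audit_entry(vendor, pic="on-call-engineer")
```

In this example `entry["released"]` is `False` because the external requester has no NDA recorded. The remaining SOP items (encryption, masking, SFTP, retention) would apply after this gate passes.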
  20. Paper.id Data Governance Implementation - Data Recovery Plan - Implementation

    1. Simply put: “If somebody deletes a dataset / table accidentally (or intentionally), how long does it take to recover the deleted data?”
    2. This framework tells us to be prepared for the storm whenever it comes:
       a. Backup strategy
       b. Recovery strategy
       c. Testing and maintenance
       d. Roles and responsibilities
    3. Document everything for audit:
       a. Incident detail
       b. PIC to do DRP

    Scenario: a data mart table just got deleted accidentally.

    - Do the SOP for DRP, run a script for recovery: table completely recovered.
    - Panicked, open your phone, record a “Velocity” video: table not recovered at all.
  21. Paper.id Data Governance Implementation - Incident Management - Workflow

    1. When things go wrong, how quickly can we respond?
    2. This framework provides a structured approach to handling data incidents:
       a. Clear detection mechanisms
       b. Rapid response coordination
       c. Effective communication
       d. Resolution and documentation
    3. War Room Protocol - The command center for crisis management
       a. Incident detection through monitoring channels
       b. Immediate team assembly with defined roles
       c. Centralized decision-making process
       d. Real-time stakeholder communication
    4. On-Call Rotation - Someone is always watching
       a. Primary and backup responders
       b. Escalation paths for critical incidents
       c. Response time SLAs (15-minute acknowledgment)
       d. Write the post-mortem documentation

    Workflow: Incident Detection, War Room Initiation, Stakeholder Communication, Resolution Implementation, Post-Incident Review, Documentation.
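
The 15-minute acknowledgment SLA from the on-call rotation can be checked with a few lines of standard-library Python. This is a minimal sketch; the function name and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta

ACK_SLA = timedelta(minutes=15)  # acknowledgment target from the slide


def ack_within_sla(detected_at: datetime, acknowledged_at: datetime) -> bool:
    """True if the incident was acknowledged within the SLA window."""
    if acknowledged_at < detected_at:
        raise ValueError("acknowledgment cannot precede detection")
    return acknowledged_at - detected_at <= ACK_SLA


detected = datetime(2025, 3, 20, 10, 0)
on_time = ack_within_sla(detected, detected + timedelta(minutes=15))
late = ack_within_sla(detected, detected + timedelta(minutes=16))
```

An acknowledgment at exactly 15 minutes counts as on time in this sketch; whether the boundary is inclusive is an assumption.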
  22. Paper.id Data Governance Implementation - PII Data Management

    Paper.id uses dbt (data build tool) for data transformation and metadata management. Here is a snapshot of how we manage PII data in dbt.

    1. dbt tag - PII classification: centralize the definition and document the tagging standards, including the validation test.
    2. Separate the dataset storage for tables that contain more than 80% PII data into another dataset, as part of the data warehouse and data mart strategy (accessible only to the data team).
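
The 80% rule can be sketched as a routing function over column tags. The slide does not say how the percentage is measured, so this example assumes it is the share of columns tagged `pii`; the tag name and both dataset names are hypothetical as well.

```python
PII_TAG = "pii"          # assumed tag name
PII_THRESHOLD = 0.80     # "more than 80% PII data" from the slide


def pii_ratio(column_tags: dict) -> float:
    """Share of columns whose tag list includes the PII tag."""
    if not column_tags:
        return 0.0
    pii_columns = sum(1 for tags in column_tags.values() if PII_TAG in tags)
    return pii_columns / len(column_tags)


def target_dataset(column_tags: dict,
                   default_dataset: str = "warehouse",
                   restricted_dataset: str = "warehouse_pii") -> str:
    """Route mostly-PII tables to the restricted dataset."""
    if pii_ratio(column_tags) > PII_THRESHOLD:
        return restricted_dataset
    return default_dataset


customer_contact = {
    "full_name": ["pii"],
    "email": ["pii"],
    "phone": ["pii"],
    "address": ["pii"],
    "national_id": ["pii"],
}
invoice = {"invoice_id": [], "amount": [], "buyer_email": ["pii"]}
```

With these inputs `target_dataset(customer_contact)` selects the restricted dataset, while `target_dataset(invoice)` stays in the default one.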
  23. Paper.id Data Governance Implementation - AI Automation - PII Data Management

    We achieved 100% metadata documentation for all data models in dbt and BigQuery by utilizing AI automation:

    1. Generate a sample query (limit 5)
    2. Process with AI
       a. Column metadata file (.yaml)
          i. Column description
          ii. Tags (PII & column categorization)
          iii. dbt test
       b. Table metadata file (.md)
          i. Table description
          ii. Potential analysis
  24. Paper.id Data Governance Implementation - AI Automation - PII Data Management
  25. Paper.id Data Governance Implementation - AI Automation - PII Data Management

    | Layer Name | Stage Name | Time Before AI Automation | Time After AI Automation |
    |---|---|---|---|
    | L1 - Data Lake | Create sql script for L1 CDC | 5 - 10 minutes | 1 minute |
    | L2 - Data Staging | Create sql script for L2 | 10 - 20 minutes | 1 minute |
    | L3 - Data Warehouse | Create sql script for L3 | 5 - 10 minutes | 1 minute (all processes) |
    | | Create yaml script for L3 | 10 - 15 minutes | |
    | | Create md script for L3 | 5 - 10 minutes | |
    | Total | | 30 - 60 minutes | 3 minutes |

    Working efficiency improved by 10 to 20 times.
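
The 10 to 20 times figure follows directly from the totals reported on the slide (30 to 60 minutes before, 3 minutes after). A short check, using only those reported totals:

```python
before_minutes = (30, 60)  # reported total range before AI automation
after_minutes = 3          # reported total after AI automation

# Speedup at the low and high end of the reported range.
speedup = tuple(minutes / after_minutes for minutes in before_minutes)
```

`speedup` evaluates to `(10.0, 20.0)`, matching the stated improvement.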
  26. Paper.id Data Governance Implementation - AI Automation - PII Data Management
  27. Paper.id Data Governance Implementation - AI Automation - Pull Request Reviewer
  28. Paper.id Data Governance Implementation - AI Automation - Data Governance Document Completion

    Technical Documentation for UU PDP Compliance:

    - Data Inventory and Mapping: Technical documentation capturing all databases, data flows, and storage locations containing personal data, with classifications (specific vs. general), retention periods, and access controls. This serves as the foundation for all compliance activities.
    - Data Anonymization and Pseudonymization: Technical specifications for methods to de-identify personal data for analytics, testing, and other non-production environments, including hashing algorithms, tokenization techniques, and data masking rules.
    - Data Access Control Matrix: Detailed RBAC (Role-Based Access Control) implementation document specifying which roles can access which categories of data, with technical controls for enforcement in databases, data warehouses, and analytics platforms.
    - Data Pipeline Security Configuration: Technical specifications for securing ETL/ELT processes, ensuring personal data is protected during extraction, transformation, loading, and at rest in data lakes/warehouses.
    - Data Retention Implementation Guide: Technical procedures for automated data purging, archiving, and deletion across systems based on retention schedules, including logging mechanisms to prove compliance.
    - Data Breach Detection and Incident Management: Technical monitoring configurations, alert thresholds, and automated response procedures to identify and contain potential data breaches within timeframes that support the 3×24 hour notification requirement.
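
The 3×24 hour notification requirement mentioned above translates into a simple deadline calculation. The sketch below assumes the clock starts at the moment the breach is detected, which the slide does not specify; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=3 * 24)  # 3x24 hours


def notification_deadline(detected_at: datetime) -> datetime:
    """Latest time to notify, counted from detection (an assumption)."""
    return detected_at + NOTIFICATION_WINDOW


def hours_remaining(detected_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    remaining = notification_deadline(detected_at) - now
    return remaining.total_seconds() / 3600


detected = datetime(2025, 3, 20, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(detected)
left_after_one_day = hours_remaining(detected, detected + timedelta(days=1))
```

Using timezone-aware timestamps avoids ambiguity when monitoring systems and responders sit in different time zones.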
  29. Paper.id Data Governance Implementation - AI Automation - Data Governance Document Completion

    Strategic Documentation for UU PDP Compliance:

    - PII Data Strategy: High-level strategy for minimizing PII collection, implementing privacy by design principles, and ensuring lawful processing throughout the data ecosystem.
    - Data Protection Testing Strategy: Approach for regular security testing of data systems, including penetration testing, vulnerability scanning, and validation of access controls for PII.
    - Data Ethics Guidelines: Strategic principles for ethical use of personal data beyond legal compliance, especially for advanced analytics, AI, and machine learning applications.
    - Data Compliance Roadmap: Strategic planning document outlining the implementation sequence for technical controls, with prioritization based on risk levels and UU PDP requirements.
  30. Paper.id Data Governance Implementation - AI Automation - Data Governance Document Completion

    1. Ask AI to create a prompt for the document description
    2. Document creation by AI
    3. Document review by human
    4. Document completion by human
    5. Repeat

    *For documentation writing, we suggest using the Claude Sonnet model (higher output token limit, more natural writing style).
  31. Paper.id Data Governance Implementation - AI Automation - Data Governance Document Completion
  32. Paper.id Data Governance Implementation - AI Automation - Data Governance Document Completion

    1. Ask AI to create the folder structure
    2. Ask AI to fill in the documents
    3. Evaluate and finish
  33. Paper.id Story on Data Governance Journey

    How do we “achieve” the unpredictable stage with a minimal number of data team members?
  34. Paper.id Data Team

    The foundation of the Paper.id data team:

    - [2018] Establishment of Data Div: Only data analysts, no data engineers or data scientists.
    - [Q2 2023] First Data Lead Joined: Data analysts and data scientists report directly to the Data Lead.
    - [Q1 2024] Data Team Restructuring: Data analysts and data scientists were separated into two different groups, and we “hired” our first data engineer from an internship position.
    - [2024 onwards] AI-Driven Development: Because data engineering manpower is scarce (to date we have only two full-time data engineers to fulfill requests from all Paper.id team members while also keeping innovation, data migration, the data governance project, and other work running), we embed AI into our way of working at Paper.id. It is a mandatory skill set on our team.
  35. Paper.id Data Team

    A very limited amount of manpower indefinitely limits our outcome. Prioritization = Core Problem.

    - Cost Management (Reduction)
    - Data Migration (New tools)
    - Legacy Tool Management
    - Data Tool Research
    - Data Governance Project
    - Business As Usual Data Request
    - Data Quality Management
    - Personal Skill Development
    - Certification (ISO, PCI DSS)

    AI helps us by acting as “another hand” that makes our work easier and faster.
  36. Paper.id Data Team

    How “AI” powers us to deliver more results: AI “levels up” the capability of our team members, enabling them to develop with higher quality, greater efficiency, and more outcomes.

    Diagram labels: Junior Engineer, Senior Engineer.