Reza Dwi Utomo

Data Scientist | ML Engineer | AI Enthusiast

~ A lifelong learner ~

About Me

I am an engineer focused on data-driven analysis, with 6 years of diverse experience that includes independent research in intelligent control, machine learning, and reliability in manufacturing. I have worked across railway research, manufacturing, telecommunications, legal, e-commerce, and IT consulting. I currently work as a Lead Staff / Senior Engineer Specialist in Data & AI at Mastersystem Infotama, a leading ICT system integrator.

I am highly interested in the application of AI and technology, particularly in areas such as intelligent transport systems, natural language processing, computer vision, big data, data science, and other intelligent applications. I am skilled in analytical programming languages such as Python, MATLAB, and R, and am proficient in the AWS and GCP cloud environments. I hold several data science and machine learning certificates from AWS, Udacity, deeplearning.ai, Microsoft, and IBM.

Latest Experience

Mastersystem Infotama

Lead Staff / Senior Engineer Specialist

  • Using AWS services to solve clients' requirements.
  • Allocating human resources to various projects based on the project's assessment.
  • Initiating an end-to-end ML system
    • Developing ML Pipeline
    • Developing Explainable-AI Analysis
    • Developing Dashboard
    • Developing CI/CD Pipeline for ML System
    • Developing IaC for ML infrastructure on the cloud
    Technologies used: AWS (SageMaker, ECR, S3, EC2, EventBridge, CodeCommit, CodeBuild, CodePipeline, CloudFormation, SNS), Streamlit, Plotly, SHAP, XGBoost
  • Project: Data Warehouse Migration at PT Bank BTPN Syariah, Tbk
    • Developed an ETL framework for preprocessing data from the on-premises server to the AWS Cloud
    • Developed ETL pipeline for daily data processing
    • Converted SQL Server queries to Spark SQL queries
    • Developed Scheduler Workflow to automate the ETL process
    Technologies used: AWS (IAM, S3, Glue, Lambda, Step Functions, RDS, Athena, EventBridge, CloudWatch, SNS), Python, MySQL, PySpark, Microsoft SSIS, SQL Server
  • Project: Operational Data Source at PT Telekomunikasi Seluler
    • Developed ETL pipeline for data validation check
    • Developing ETL pipeline to preprocess data from Phoenix (hosted on EMR) to S3 & Athena
    Technologies used: AWS (IAM, S3, Lambda, Glue, Athena, Step Functions, EventBridge, CloudWatch, SNS, EMR), Python, PySpark, Phoenix
  • Project: FastDB at PT Telekomunikasi Seluler
    • Converting Kinetica Stored Procedures queries to Redshift Stored Procedures queries
    Technologies used: Amazon Redshift, Kinetica
  • Product Update: Centralized Logging at PT Lintasarta
    • SIEM (Security Information and Event Management) Dashboard
    Technologies used: AWS (IAM, OpenSearch, CloudFormation, S3, Lambda, SNS)
  • PoC: GenAI Chatbot at PT Telekomunikasi Seluler
    • Developed Chatbot Application
    • Used Claude LLM hosted on Amazon Bedrock
    • Created Conversational AI + RAG Application
    Technologies used: AWS (EC2, Lambda, Bedrock, Kendra, Aurora PostgreSQL with pgvector), Claude, OpenAI, LangChain, FastAPI, Prompt Engineering
  • Internal Project: GenAI QnA for internal documents of PT Mastersystem Infotama, Tbk
    • Developed Chatbot Application
    • Used OpenAI's GPT-3.5 LLM
    • Created RAG Application
    Technologies used: Amazon EC2, Chroma, OpenAI, LangChain, Streamlit, Prompt Engineering

Tokopedia

Senior Data Scientist

  • Served as the person in charge (PIC) of data science for shop recommendations.
  • Developed complex recommendation systems for online sellers through various merchant campaign tools.
  • Analysed hundreds of millions of data points on products and shops.
  • Collaborated with other teams and stakeholders to develop new features.
  • Technologies used: Python, Pandas, NumPy, Scikit-learn, GCP, BigQuery, Plotly, Bash, Flask, Docker, GitHub, and many more.

    Legal Analytics (Powered by Telkom Indonesia)

    Head of Legal Data Analytics

  • Helped government and corporate organizations implement Big Data and AI-based legal and social solutions.
  • Translated requirements from users or management levels into Big Data and AI solutions.
  • Managed a team of data scientists to develop and deliver Big Data and AI solutions.

    PT Telkom Indonesia (Persero), Tbk

    Data Scientist

  • Worked in a data science team on a project implementing Natural Language Processing for Indonesian law.
  • Developed a named entity recognition (NER) model for Indonesian law using spaCy, Keras, TensorFlow, and the IndoBERT pre-trained model.
  • Used Elasticsearch to analyse text data.
  • Developed WhatsApp Bot and Telegram Bot.
  • Analysed public transport ticketing data to gain insights.
  • Developed a methodology to categorise potential risks in social media and news texts.
  • Developed text summarisation for Registrasi documents of Mahkamah Konstitusi (applications to the Indonesian Constitutional Court).
  • Developed BERT-based Text Similarity
  • Worked on online news analytics
  • Worked on social media analytics
  • Developed Affective Text Generation
  • Managed model deployments
  • Technologies used: Python, Pandas, NumPy, Scikit-learn, MariaDB, spaCy, NLTK, Gensim, FastText, Plotly, Dash, Keras, TensorFlow, PyTorch, Transformers, AWS, MLflow, DVC, Elasticsearch, Kibana, Bash, Flask, FastAPI, Docker, GitLab, Gephi, and many more

    CODEX (Powered by Telkom Indonesia)

    Data Scientist

  • Worked in a data science team on a project implementing Natural Language Processing for text similarity in law.
  • Volunteered in a data science team for the COVID-19 PeduliLindungi app project, scraping COVID-19 data from various official websites of Indonesian regional governments.
  • Implemented anomaly detection in a big data analytics project at PT Pupuk Indonesia.
  • Technologies used: Python, Pandas, NumPy, Scikit-learn, Beautiful Soup, Selenium WebDriver, MariaDB, spaCy, NLTK, Flask, and many more

    PT Industri Kereta Api (Persero)

    Engineer in Reliability Analytics

  • Performed statistics-based (e.g. descriptive, inferential, regression) reliability prediction analysis (developed in MATLAB) on passenger coaches, the results of which served as a basis for managerial decisions.
  • Assigned to develop an INKA-specific working concept for RAMS (Reliability, Availability, Maintainability and Safety) analysis of the rolling stock system.
  • Served as the main contributor to the development of the company's RAMS management process.
  • Carried out the RAMS management process in accordance with EN 50126.
  • Technologies used: MATLAB, Regression, Maximum Likelihood Estimate, Monte Carlo Simulation

    Centre of Technology for Transportation System and Infrastructure, Agency for the Assessment and Application of Technology (BPPT)

    Junior Expert in Reliability Analytics

  • Performed statistics-based (e.g. descriptive, inferential, regression) reliability prediction analysis.
  • Developed FTA (Fault Tree Analysis) in R using RStudio.
  • Involved in the LRT Greater Jakarta project.
  • Worked in an engineering team collaborating with PT INKA.
  • Performed RAMS (Reliability, Availability, Maintainability and Safety) analysis (e.g. FTA, FMEA, MTTR, MTBF, EN 50126) for the door, HVAC, bogie, wiring and piping connection, and control panel systems of the rolling stock.
  • Technologies used: R, MATLAB, Regression, Maximum Likelihood Estimate

    Birmingham Centre for Railway Research and Education (BCRRE)

    Postgraduate Researcher

  • Used BRaVE, a microscopic simulator written in Java and developed at the University of Birmingham, to apply three-aspect signalling and a timetable to the route from Wirksworth to Duffield via Shottle, United Kingdom, with one train in each direction every half hour.
  • Developed an ATO (Automatic Train Operation) system in MATLAB for the DLR (Docklands Light Railway) line in London.
  • Technologies used: MATLAB, Simulink, BRaVE, Fuzzy Gain Scheduling, Kalman Filter, Statistics, Adaptive Controller, PID Control, Automatic Train Operation Control

    Education

    University of Birmingham, UK

    Aug 2016 - Sep 2018

    Master of Research in Railway Systems Integration

    Core modules: Mathematics as an Engineering Tool, Railway Operations and Control Systems, Railway Traction Systems Design, Railway Control Systems Engineering, Research Skills and Research Environment.

    The thesis developed ATO (Automatic Train Operation) control systems with a Kalman filter (implemented in MATLAB), using the Docklands Light Railway in London as a case study. Part of the thesis was presented at the IEEE International Conference on Intelligent Rail Transportation (ICIRT) in Singapore in December 2018 and published on IEEE Xplore.

    Diponegoro University, Indonesia

    Sep 2010 - Aug 2015

    Bachelor of Engineering in Computer Engineering

    Core modules: artificial intelligence, fuzzy logic, neural networks, real-time operating system (RTOS), microprocessor design, embedded system and distributed embedded system.

    The final project developed a fuzzy logic controller (in MATLAB/Simulink) to control the transfer function of the train; it was published in IEEE conference proceedings.

    Featured Side Projects

    How Much Do You Know to Enter Data Science Field?

    Over the past couple of years, the hype around data science has been growing. Many people, men and women alike, are trying to enter the field. However ...

    Disaster Response Pipeline Project

    In this project, you'll find a web app where you can input a new message and get classification results in several categories. The web app will also display visualizations of the data ...

    Starbucks Capstone Project

    This project is for the Starbucks Capstone Challenge of the Data Scientist Nanodegree at Udacity. The dataset provided is simulated data that mimics customer behavior on the Starbucks rewards mobile app ...

    SpaCy NER

    This repo shows how to use Indonesian NER with spaCy. It contains the files and folders needed to understand how to use spaCy to train an Indonesian-language NER model.

    Scraping Covid-19 Websites

    This repo contains Python scripts to scrape information regarding Covid-19 in Bali province and its regencies ...

    Life Expectancy Shiny Dashboard

    A data visualization project. Life expectancy is displayed in a web dashboard built with Shiny Dashboard.

    Trainings & Certificates

    AWS Certified Machine Learning - Specialty

    Issued by AWS
    Issued Oct 2023 | Expires Oct 2026

    AWS Certified Data Analytics - Specialty

    Issued by AWS
    Issued Aug 2023 | Expires Aug 2026

    Microsoft Certified: Azure Fundamentals

    Issued by Microsoft
    Issued Jan 2020 | No Expiration Date

    Data Scientist Nanodegree

    Issued by Udacity
    Issued Dec 2021 | No Expiration Date

    Microsoft Certified: Azure AI Fundamentals

    Issued by Microsoft
    Issued Jul 2021 | No Expiration Date

    Natural Language Processing Specialization

    Issued by deeplearning.ai
    Issued Jul 2021 | No Expiration Date
