Data Scientist
Developing an AI-based interactive sport coach to support and generate athletes' workouts.
Developing a GenAI chat application for a sport platform.
Working on and managing the scaling of LLM deployments.
Working on a PoC for deploying LLMs to AWS Inferentia.
Collaborating with other teams and stakeholders to develop new features.
Technologies used: Python, PyTorch, FastAPI, Hugging Face, Chainlit, OpenAI, vLLM, Bash, Docker, Kubernetes, Ray Serve, Elasticsearch, GitLab, AWS (EC2 & Inferentia), and many more.
Lead Staff / Senior Engineer Specialist
-
Using AWS services to meet clients' requirements.
-
Allocating staff to projects based on each project's assessment.
-
Project: Water Distribution Networks at PT Tirta Raharja (a water distribution company)
- Developed a custom intelligent system
- Worked on time-series data
Technologies used: Python, AWS (SageMaker), Pandas, Scikit-learn
-
PoC: Image detection on cigarette packs at Gudang Garam Group
- Fine-tuned Amazon Rekognition
- Implemented data augmentation
Technologies used: Python, AWS (SageMaker, Rekognition)
-
Initiated an end-to-end ML system
- Developed ML Pipeline
- Developed Explainable-AI Analysis
- Developed Dashboard
- Developed CI/CD Pipeline for ML System
- Developed IaC for ML infrastructure on the cloud
Technologies used: AWS (SageMaker, ECR, S3, EC2, EventBridge, CodeCommit, CodeBuild, CodePipeline, CloudFormation, SNS), Streamlit, Plotly, SHAP, XGBoost
-
Project: Data Warehouse Migration at PT Bank BTPN Syariah, Tbk
- Developed ETL Framework for preprocessing data from the on-premises server to AWS Cloud
- Developed ETL pipeline for daily data processing
- Converted SQL Server queries to Spark SQL queries
- Developed Scheduler Workflow to automate the ETL process
Technologies used: AWS (IAM, S3, Glue, Lambda, Step Functions, RDS, Athena, EventBridge, CloudWatch, SNS), Python, MySQL, PySpark, Microsoft SSIS, SQL Server
-
Project: Operational Data Source at PT Telekomunikasi Seluler
- Developed ETL pipeline for data validation checks
- Developed ETL pipeline to preprocess data from Phoenix (hosted on EMR) to S3 & Athena
Technologies used: AWS (IAM, S3, Lambda, Glue, Athena, Step Functions, EventBridge, CloudWatch, SNS, EMR), Python, PySpark, Phoenix
-
Project: FastDB at PT Telekomunikasi Seluler
- Converted Kinetica stored procedures to Redshift stored procedures
Technologies used: Amazon Redshift, Kinetica
-
Product Update: Centralized Logging at PT Lintasarta
- SIEM (Security Information and Event Management) Dashboard
Technologies used: AWS (IAM, OpenSearch, CloudFormation, S3, Lambda, SNS)
-
PoC: GenAI Chatbot at PT Telekomunikasi Seluler
- Developed Chatbot Application
- Used Claude LLM hosted on Amazon Bedrock
- Created Conversational AI + RAG Application
Technologies used: AWS (EC2, Lambda, Bedrock, Kendra, Aurora PostgreSQL with pgvector), Claude, OpenAI, LangChain, FastAPI, Prompt Engineering
-
Internal Project: GenAI QnA for internal documents of PT Mastersystem Infotama, Tbk
- Developed Chatbot Application
- Used OpenAI's GPT-3.5 LLM
- Created RAG Application
Technologies used: Amazon EC2, Chroma, OpenAI, LangChain, Streamlit, Prompt Engineering
Senior Data Scientist
Worked as the person in charge (PIC) of data science for shop recommendations.
Developed complex recommendation systems for online sellers through various merchant campaign tools.
Analysed hundreds of millions of data points on products and shops.
Collaborated with other teams and stakeholders to develop new features.
Technologies used: Python, Pandas, NumPy, Scikit-learn, GCP, BigQuery, Plotly, Bash, Flask, Docker, GitHub, and many more.
Head of Legal Data Analytics
Helped governmental and corporate organizations implement Big Data and AI-based legal and social solutions.
Translated requirements from users or management levels into Big Data and AI solutions.
Managed a team of data scientists to develop and deliver Big Data and AI solutions.
Data Scientist
Worked in a data science team on a project implementing Natural Language Processing for Indonesian law.
Developed a named entity recognition (NER) model for Indonesian law using spaCy, Keras, TensorFlow, and the IndoBERT pre-trained model.
Used Elasticsearch to analyse text data.
Developed WhatsApp Bot and Telegram Bot.
Analysed public transport ticketing data to gain insights.
Developed a methodology to categorise potential risks in social media and news texts.
Developed Text Summarisation for Registrasi documents of Mahkamah Konstitusi (applications to the Indonesian Constitutional Court).
Developed BERT-based Text Similarity
Worked on online news analytics
Worked on social media analytics
Developed Affective Text Generation
Managed model deployments
Technologies used: Python, Pandas, NumPy, Scikit-learn, MariaDB, spaCy, NLTK, Gensim, FastText, Plotly, Dash, Keras, TensorFlow, PyTorch, Transformers, AWS, MLflow, DVC, Elasticsearch, Kibana, Bash, Flask, FastAPI, Docker, GitLab, Gephi, and many more
Data Scientist
Worked in a data science team on a project implementing Natural Language Processing for text similarity in law.
Worked in a volunteer data science team on the COVID-19 PeduliLindungi app project. Scraped COVID-19 data from official websites of regional governments in Indonesia.
Implemented anomaly detection in a big data analytics project at PT Pupuk Indonesia.
Technologies used: Python, Pandas, NumPy, Scikit-learn, Beautiful Soup, Selenium WebDriver, MariaDB, spaCy, NLTK, Flask, and many more
Engineer in Reliability Analytics
Performed statistics-based (e.g. descriptive, inferential, regression) reliability prediction analysis (developed in MATLAB) on passenger coaches, with results used as a basis for managerial decisions.
Assigned to develop an INKA-specific work concept for RAMS (Reliability, Availability, Maintainability and Safety) analysis of the rolling stock system.
Served as the main contributor to the development of the RAMS management process at the company.
Carried out the RAMS management process in accordance with EN 50126.
Technologies used: MATLAB, Regression, Maximum Likelihood Estimate, Monte Carlo Simulation
Junior Expert in Reliability Analytics
Performed statistics-based (e.g. descriptive, inferential, regression) reliability prediction analysis.
Developed FTA (Fault Tree Analysis) in R using RStudio.
Involved in the LRT Greater Jakarta project.
Worked in an engineering team collaborating with PT INKA.
Performed RAMS (Reliability, Availability, Maintainability and Safety) analysis (e.g. FTA, FMEA, MTTR, MTBF, EN 50126) for the door, HVAC, bogie, wiring and piping connection, and control panel systems of the rolling stock.
Technologies used: R, MATLAB, Regression, Maximum Likelihood Estimate
Postgraduate Researcher
Used BRaVE (a microscopic simulator written in Java and developed at the University of Birmingham) to apply three-aspect signalling and a timetable to the route from Wirksworth to Duffield via Shottle, United Kingdom, with one train in each direction every half hour.
Developed an ATO (Automatic Train Operation) system in MATLAB for the DLR (Docklands Light Railway) line in London.
Technologies used: MATLAB, Simulink, BRaVE, Fuzzy Gain Scheduling, Kalman Filter, Statistics, Adaptive Controller, PID Control, Automatic Train Operation Control