▷ Hi! I am Sadat Shahriyar, an ML Engineer at Samsung R&D Institute. I obtained my B.Sc. in Computer Science from the Department of CSE, Bangladesh University of Engineering and Technology (BUET). My research interests lie broadly in Natural Language Processing, Software Engineering, Reliability of Software Systems, and Trustworthy AI. My current career goal is to pursue a graduate program in my field of interest.
▷ With over a year of experience as an ML Engineer at Samsung R&D Institute, I have developed solutions using LLMs, VLMs, Transformers, and vision models in the Software Engineering and Testing domain. I contributed to two projects and two patents, served as an instructor for Gen AI training, and created multiple proof-of-concept tools. For my contributions to research, I received the Excellence Award for Innovation from Samsung R&D Institute.
▷ Previously I worked as a Graduate Research Assistant at BUET on a collaborative project with Samsung R&D Institute, supervised by Dr. Anindya Iqbal, Sukarna Barua, and Tahmid Hasan. We studied the effectiveness of Large Language Models in generating and executing test cases for Android applications.
▷ For my undergraduate thesis, supervised by Dr. Anindya Iqbal and Dr. Shahrear Iqbal, I studied the efficacy of language models in detecting Android malware from system call sequences.
▷ During my undergraduate years at BUET, I regularly participated in and excelled at competitions including Dhaka AI 2020 and HackNSU 2020. I have taken on leadership roles as an ML instructor for internal Generative AI training at my workplace and, during my undergrad, at departmental events such as the annual BUET CSE FEST.
▷ In my leisure time, I enjoy travelling, listening to music, and binge-watching sitcoms, TV series, and anime.
This research project, conducted at Samsung R&D Institute, evaluated the effectiveness of large language models (LLMs) in executing test cases. We developed a framework with a robust approach for identifying the UI elements needed to perform the actions described in a test case using an LLM, then interacting with those elements through Selenium. Finally, we applied semantic similarity-based textual matching between the final screen resulting from the UI interactions and the expected outcome defined in the test case to generate a verdict. My specific contributions included simplifying web pages into a more generalized format for LLM comprehension, conducting few-shot prompting experiments with LLMs (such as Solar-10.7B, StableBeluga-13B, and LLaMa-2-7B), preparing the dataset, fine-tuning LLMs (LLaMa-3-8B, Solar-10.7B) on the annotated dataset for UI element identification, and generating semantic similarity-based verdicts with all-MiniLM-L6-v2 (a Sentence Transformer model), given the final screen and the expected result. Our approach successfully executed 106 out of 110 targeted test cases, an accuracy of about 96%.
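The verdict step can be sketched as follows. For illustration, a toy bag-of-words embedding stands in for the all-MiniLM-L6-v2 sentence embeddings the project actually used, and the pass/fail threshold is an assumption:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; the project used all-MiniLM-L6-v2
    sentence embeddings instead of word counts."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def verdict(final_screen_text, expected_result, threshold=0.5):
    """PASS if the final screen semantically matches the expected outcome."""
    score = cosine_similarity(embed(final_screen_text), embed(expected_result))
    return "PASS" if score >= threshold else "FAIL"
```

In practice, dense sentence embeddings capture paraphrases ("logged in" vs. "sign-in succeeded") that simple word overlap misses, which is why a Sentence Transformer is used for the real verdict.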
This research project, conducted at Samsung R&D Institute, investigated the efficacy of Large Language Models (LLMs), Vision-Language Models (VLMs), and computer vision models in generating test cases from UI guides. We developed a framework that detects functional flows within the software from its UI guides using VLMs, LLMs, and vision models, then converts those flows into test cases algorithmically. We also generated expected outcomes for specific functional flows using VLMs. My contributions included conducting few-shot prompting experiments and fine-tuning VLMs (such as CogVLM and Idefics2) for UI screen identification and functional flow mapping, and matching UI screens with their corresponding descriptions using the LLaMa-3-8B model. The framework generates test cases from extensive UI guides with minimal error and within a reasonable time frame.
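A minimal sketch of the algorithmic flow-to-test-case conversion; the (screen, action) tuple schema and step wording here are illustrative assumptions, not the project's actual format:

```python
def flow_to_test_case(flow):
    """Convert a detected functional flow, given as (screen, action)
    pairs, into numbered natural-language test steps."""
    return [
        f"Step {i}: On the '{screen}' screen, {action}."
        for i, (screen, action) in enumerate(flow, start=1)
    ]

# Hypothetical flow extracted by the VLM stage from a UI guide
flow = [("Settings", "tap 'Display'"), ("Display", "enable 'Dark mode'")]
steps = flow_to_test_case(flow)
```

The heavy lifting in the real pipeline is upstream, in the VLM/LLM stages that extract the flow from screenshots and guide text; once the flow is structured, turning it into steps is deterministic, as above.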
This project is a collaboration between Samsung R&D Institute and BUET, where we studied the effectiveness of Large Language Models in generating and executing test cases for Android applications under the supervision of Dr. Anindya Iqbal. We developed an automated UI testing tool for Android applications that reduces manual testing effort by 80% by automatically generating test cases. The tool uses a custom DFS algorithm on top of Appium to traverse the application's UI and write test steps. EfficientNet and MaxViT are employed to classify unknown screen elements and to match screen similarity for maintaining DFS state, while the Flan-T5 generative model interprets screen representations and generates the expected result for each test step.
Undergraduate thesis project under the supervision of Dr. Anindya Iqbal and Dr. Shahrear Iqbal (Research Officer, National Research Council (NRC) Canada). In this research, we created a framework to efficiently process sequences of system calls, making them more accessible to language models for identifying malicious patterns and detecting Android malware. We fine-tuned various Transformer-based language models (BERT, RoBERTa, BigBird, Longformer) with different sequence lengths and techniques such as supervised fine-tuning and contrastive learning, and compared their performance against existing models such as LSTM, Random Forest, and SVM. Our evaluation demonstrated a 6-7% improvement over the previous state-of-the-art model.
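One preprocessing concern when feeding long system-call traces to Transformers is their fixed context length. A minimal sketch of sliding-window chunking, with window and stride sizes as illustrative assumptions (the thesis experimented with several sequence lengths):

```python
def preprocess_syscalls(calls, max_len=512, stride=256):
    """Split a long system-call sequence into overlapping windows that
    fit a Transformer's context. Each window becomes one input string
    of space-separated call names; overlap preserves cross-boundary
    patterns."""
    windows, start = [], 0
    while True:
        windows.append(" ".join(calls[start:start + max_len]))
        if start + max_len >= len(calls):
            break
        start += stride
    return windows
```

Long-context models like BigBird and Longformer relax this constraint, which is one reason they were included in the comparison alongside BERT and RoBERTa.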
[Thesis dissertation] This is a project under the supervision of Dr. Anindya Iqbal in which we aim to produce more human-like code reviews and automate the review-generation procedure with large language models. We are applying the latest RL-based preference optimization to fine-tune state-of-the-art open-source LLMs, and working to identify evaluation metrics that better capture the human-likeness of generated code reviews. The knowledge domains this study draws on are code review, LLM inference, LLM fine-tuning, and objective evaluation.