Traffic congestion is a constant source of grief on city roads and a fact of life for urban workers, many of whom endure the same monotonous commute for hours at a time. Bangkok, the capital of Thailand, faces this problem daily and is ranked the 52nd most congested city in the world.
While we recognize that large-scale improvements to commuting time can likely be achieved only by changing the layout and routing of the roads themselves, a smaller-scale improvement remains available: optimizing the existing network. AI-driven Traffic Management for a Better Bangkok (ATMBB) aims to provide such a solution.
When finalized, our project aims to deliver a traffic signal calibration system that takes footage from the existing network of surveillance cameras, maps and labels it, and passes it through a machine learning model to extract the number of vehicles queuing on each side of an intersection. A visualization dashboard will let the relevant controlling bodies observe the current signal timings and detections; it will also include heatmaps that give the viewer a better sense of the traffic density in a particular area.
3.1 Dataset Acquisition
One difference between our project this semester and previous ones is that our advisor was already in contact with groups that could give us access to a large cache of usable data. The dataset comes from the CCTV system belonging to the “Samyan” group, which is led and managed by PMCU, Chulalongkorn University’s property management arm.
Once the dataset had been gathered, the raw footage was first passed through a Python script that samples one image for every hour of recording and collects the images in a designated output location. The sampled images were then distributed among the team for labeling, which was done on Roboflow. Each member’s progress was tracked in Airtable, and submissions were collected in a shared Google Drive. Once annotation is complete, Roboflow lets the labeler choose the model type the data will be used with and exports the dataset in the corresponding format. The exported segments were then merged to form our base dataset.
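The hourly sampling step can be illustrated with a minimal sketch. This is not the team's actual script: the function names, the output naming scheme, and the use of OpenCV for decoding are all assumptions made for illustration. The index helper is pure Python; the extraction function only needs OpenCV when it is actually called.

```python
from pathlib import Path
from typing import List


def sample_frame_indices(total_frames: int, fps: float, interval_s: float) -> List[int]:
    """Return the indices of the frames that fall on each sampling interval."""
    if fps <= 0 or interval_s <= 0:
        raise ValueError("fps and interval_s must be positive")
    step = max(1, round(fps * interval_s))  # frames between two samples
    return list(range(0, total_frames, step))


def extract_frames(video_path: str, out_dir: str, interval_s: float = 3600.0) -> int:
    """Save one JPEG per interval (hypothetical naming); return images written."""
    import cv2  # imported lazily so the helper above works without OpenCV

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"cannot open {video_path}")
    written = 0
    try:
        fps = cap.get(cv2.CAP_PROP_FPS)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        for idx in sample_frame_indices(total, fps, interval_s):
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the sampled frame
            ok, frame = cap.read()
            if not ok:
                break  # stop if the container reports more frames than it holds
            name = f"{Path(video_path).stem}_{idx:08d}.jpg"
            cv2.imwrite(str(out / name), frame)
            written += 1
    finally:
        cap.release()
    return written
```

For a three-hour recording at 25 frames per second, `sample_frame_indices(270000, 25.0, 3600.0)` selects one frame at the start of each hour.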
Initially, we were able to reproduce a basic YOLO model with a mean confidence score of around 70%, using a base YOLOv5 model with none of the additional algorithms. SAHI (Slicing Aided Hyper Inference) was then added to improve both the detection rate and the confidence score. At a later stage, DeepSORT was implemented to address duplicate detections caused by occlusion.
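The idea behind sliced inference is to run the detector on overlapping tiles of each frame, so that small, distant vehicles occupy more of the model's input, and then to map the tile-level boxes back onto the full frame. The sketch below shows only that tiling geometry in plain Python. It does not use the SAHI library itself, and the function names, the 512-pixel tile size, and the 20% overlap are illustrative assumptions rather than the settings used in the project.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def slice_boxes(width: int, height: int, tile: int = 512, overlap: float = 0.2) -> List[Box]:
    """Cover a width x height frame with overlapping square tiles."""
    if tile <= 0 or not 0.0 <= overlap < 1.0:
        raise ValueError("tile must be positive and overlap must be in [0, 1)")
    step = max(1, int(tile * (1.0 - overlap)))  # stride between tile origins
    boxes = []
    y = 0
    while True:
        y0 = min(y, max(height - tile, 0))  # clamp the last row to the frame edge
        x = 0
        while True:
            x0 = min(x, max(width - tile, 0))  # clamp the last column likewise
            boxes.append((x0, y0, min(x0 + tile, width), min(y0 + tile, height)))
            if x0 + tile >= width:
                break
            x += step
        if y0 + tile >= height:
            break
        y += step
    return boxes


def to_frame_coords(det: Box, tile_box: Box) -> Box:
    """Shift a detection from tile coordinates to full-frame coordinates."""
    dx, dy = tile_box[0], tile_box[1]
    return (det[0] + dx, det[1] + dy, det[2] + dx, det[3] + dy)
```

Detections from neighbouring tiles overlap in the shared margins, so a complete pipeline would still need a merging step such as non-maximum suppression, which is omitted here.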
Summary of Accomplishments
We were able to produce a vehicle detection model with a satisfactory level of accuracy. Furthermore, both car count and velocity can be derived through the use of a modified script, and the construction of a heatmap was successful.
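As a rough illustration of how counts and velocities can be derived from tracker output, the sketch below assumes each observation is a tuple `(frame_idx, track_id, cx, cy)` holding the centroid of a tracked vehicle. The tuple layout, the function names, and the single `metres_per_pixel` scale are hypothetical and are not taken from the team's modified script; a uniform scale also ignores camera perspective, so the speeds are approximate at best.

```python
from collections import defaultdict
from math import hypot


def count_unique_vehicles(tracks):
    """Count distinct track IDs, so a vehicle seen in many frames counts once."""
    return len({track_id for _, track_id, _, _ in tracks})


def estimate_speeds_kmh(tracks, fps, metres_per_pixel):
    """Average speed per track from its first to its last observed centroid."""
    by_id = defaultdict(list)
    for frame_idx, track_id, cx, cy in tracks:
        by_id[track_id].append((frame_idx, cx, cy))

    speeds = {}
    for track_id, points in by_id.items():
        points.sort()  # order observations by frame index
        (f0, x0, y0), (f1, x1, y1) = points[0], points[-1]
        if f1 == f0:
            continue  # a single observation carries no motion information
        metres = hypot(x1 - x0, y1 - y0) * metres_per_pixel  # straight-line distance
        seconds = (f1 - f0) / fps
        speeds[track_id] = metres / seconds * 3.6  # m/s to km/h
    return speeds
```

Because the count is keyed on track IDs rather than per-frame detections, it relies on the tracker keeping an ID stable through brief occlusions.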
There are still a number of functionalities considered part of the minimum viable product that have yet to be completed by our team: the signal control method, the signal optimization model, and advanced 2D simulated visualizations. Furthermore, latency will become a bigger issue as we add more onto the base model; this may prompt a model migration, possibly to one of NVIDIA’s pre-existing models or frameworks.