Since graduating in computer science from the Ho Chi Minh City University of Natural Sciences, Nguyen Hoang Bao Dai has worked as an AI research engineer in natural language processing and computer vision. Alongside his passion for computer science, Bao Dai has been interested in music since childhood. He is the composer and performer of the music video ‘Dan IT’ (IT programmers), which has attracted considerable interest from viewers on YouTube.
Bao Dai said that he usually works on the melody first, then the chords and lyrics, because he believes that a good song must first have a beautiful melody. He finds that creating the melody is the most time-consuming stage of composition: many songs have taken him months to complete after repeated revisions of the melody.
This led him to the idea of applying his knowledge of AI to support his passion for music. “I think that if AI can draw pictures, it can write music, too. Therefore, I plan to create an AI model that can generate music according to Vietnamese people’s musical tastes,” said Bao Dai.
He began research for the project in early 2017. After two years of work, his AI music generation model debuted and surprised people by generating ten melodies within a single second. With his model, the composer only needs to input a few notes; the system then processes the data, converts it into a multi-dimensional vector and creates longer melodies.
The model’s randomised algorithm allows the system to generate completely different versions of a tune, so that composers can select one and adjust it as desired. By suggesting melodies that musicians may find inspiring, the model shortens their working time and leaves them more time to refine the remaining stages, such as writing the harmony, lyrics and arrangement of the song.
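The article does not describe the internals of Bao Dai's model, so the following is only a minimal sketch of the general idea: a few seed notes go in, and several randomly varied longer melodies come out. It swaps in a much simpler technique, a first-order Markov chain over MIDI pitch numbers, and the training melodies, function names and parameters are all hypothetical illustrations rather than anything taken from his system.

```python
import random
from collections import defaultdict

# Hypothetical toy data: each melody is a list of MIDI pitch numbers (60 = middle C).
# A real system would learn from thousands of MIDI files, not three short phrases.
TRAINING_MELODIES = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [60, 64, 67, 64, 60, 62, 64, 62, 60],
    [67, 65, 64, 62, 60, 62, 64, 65, 67],
]


def build_transitions(melodies):
    """Map each pitch to the list of pitches observed immediately after it."""
    transitions = defaultdict(list)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            transitions[current].append(following)
    return transitions


def continue_melody(seed_notes, transitions, length, rng):
    """Extend seed_notes to `length` notes by sampling observed transitions."""
    if not seed_notes:
        raise ValueError("at least one seed note is required")
    melody = list(seed_notes)
    while len(melody) < length:
        candidates = transitions.get(melody[-1])
        if not candidates:
            # Pitch never seen in training: fall back to any known pitch.
            candidates = list(transitions)
        melody.append(rng.choice(candidates))
    return melody


def suggest_melodies(seed_notes, count=10, length=16, seed=0):
    """Return `count` randomly varied continuations of the same seed notes."""
    transitions = build_transitions(TRAINING_MELODIES)
    rng = random.Random(seed)  # fixed seed makes the suggestions reproducible
    return [continue_melody(seed_notes, transitions, length, rng)
            for _ in range(count)]


if __name__ == "__main__":
    for suggestion in suggest_melodies([60, 62, 64]):
        print(suggestion)
```

Each call produces ten candidate continuations of the same three seed notes. Because every next note is drawn at random from what followed that pitch in the training data, the candidates generally differ from one another, which mirrors the "select and adjust" workflow described above, although a neural model of the kind the article implies would capture far longer-range structure.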
According to Bao Dai, there has been research around the world on AI models that generate music, but most of those models work on classical music, which is very different from the musical tastes of Vietnamese people.
Therefore, Dai had to build his own algorithms so that his model could generate pop ballads, which are much favoured by Vietnamese listeners. The biggest challenge for the project was collecting input data abundant enough to “train” the model. For the model to be able to compose music for young audiences, it must be trained on that kind of music. However, this data cannot be extracted from songs on YouTube or music websites because these are finished products, while the system can only read and understand input data in the form of MIDI files.
To solve this problem, Bao Dai had to play and record the tunes of popular Vietnamese songs on an electric piano himself and then process them into MIDI files. Of the 30,000 song files that the model has stored, only about 5,000 were sourced from music forums, while the remaining 25,000 were played by Bao Dai himself.
Bao Dai also faced challenges with computing hardware, because ordinary desktops and laptops do not have enough computing power to train AI, and his budget could not stretch to a powerful new machine. Therefore, Dai decided to rent a machine through a cloud computing service and pay according to usage.
When asked whether the AI music generation model might increase dependence on technology and limit musicians’ creativity and ability in music production, Bao Dai said that the model only helps composers save time by providing references and options, rather than serving as an almighty tool to replace humans in music. He stressed that each piece of music must convey the unique style, emotion and personal signature of the musician in order to gain a place in listeners’ hearts, and that these qualities are irreplaceable.
Bao Dai revealed that he will continue his research to give the model the ability to generate chords and write lyrics, and to fully optimise its existing features, so that it can become an effective assistant for musicians.