1
MSc Stream | Dissertation (Working) Title | Description
2
DS | Exphormer: Scaling transformers for graph-structured data | See this blog post: https://blog.research.google/2024/01/exphormer-scaling-transformers-for.html
3
ASD/DS | Musical Anomaly Detection | This project will look at different ways to bring your data alive with music, in such a way that anomalies in the data are amplified in the music.
Here is an example: https://mltechniques.com/2022/08/29/the-sound-that-data-makes/
Using music can enable real-time anomaly detection in an audible way, instead of running scripts. By windowing the data, this approach can lead to earlier detection.
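As a rough illustration of the windowing idea above, the following Python sketch compares each reading with a sliding window of recent values and raises the note by an octave when the reading looks anomalous. The function name `sonify`, the MIDI-style note numbers, the window size and the z-score threshold are all assumptions made for this sketch; they do not come from the linked article.

```python
from collections import deque
from statistics import mean, pstdev


def sonify(values, window=5, base_note=60, threshold=3.0):
    """Return one MIDI-style note number per reading.

    Normal readings play base_note; a reading whose z-score against the
    previous `window` readings exceeds `threshold` plays one octave higher.
    All parameter defaults are illustrative assumptions.
    """
    recent = deque(maxlen=window)
    notes = []
    for v in values:
        anomaly = False
        if len(recent) == window:  # only score once the window is full
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma == 0:
                anomaly = v != mu  # flat window: any change is unusual
            else:
                anomaly = abs(v - mu) / sigma > threshold
        notes.append(base_note + 12 if anomaly else base_note)
        recent.append(v)
    return notes


if __name__ == "__main__":
    print(sonify([10, 11, 10, 11, 10, 50, 10]))
```

The note list could then be passed to any audio or MIDI library. How best to map anomalies to sound (pitch, tempo, instrument) is one of the questions the project itself would explore.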
4
ASD/DS | Scikit-LLM | In a similar manner to the project listed below, how can this be used to improve the process of generating machine learning models? The project will include comparison and benchmarking.
https://github.com/iryna-kondr/scikit-llm?utm_source=tldrai
5
ASD/DS | DB-GPT | Can you build upon and utilize this in the real world? The DB-GPT project aims to build a complete private large model solution for all database-based scenarios. This solution supports local deployment, allowing it to be applied not only in independent private environments but also to be independently deployed and isolated according to business modules, ensuring that the ability of large models is absolutely private, secure, and controllable.
https://github.com/csunny/DB-GPT?utm_source=tldrai
6
ASD | Do micro-services really need their own Database? Probably not | This project will explore various micro-services architectures and the effect of Database deployment across these. This will impact data pipelines, data replication, etc., and various regulatory issues/requirements. The project will explore how to develop a more balanced approach, with a single database servicing many micro-services, and will create a set of recommendations for others to follow.
7
ASD | Energy Efficiency of ORM Approaches | There are a variety of different ORM approaches. Most have many limitations, and over recent years there has been a migration away from these, with developers having to write efficient SQL. This in turn has many limitations. This project will explore various aspects; for example, see this paper: https://core.ac.uk/download/pdf/43410055.pdf
8
ASD/DS | GPT-4 in the Classroom | Assessing the use of GPT-4 as a teaching aid for post-graduate modules, particularly software engineering, coding, design and similar modules.
9
ASD/DS | Assessing using GPT-4 for visual accessibility | See the project below, "Walk with me". Can something similar be done with GPT-4? What about using it to create other visual aids?
10
ASD/DS | Analysing Trends in Conference Papers | Using the data from this site on best conference papers from the last 25 years, can you identify any trends, insights, etc. from these papers? This can be illustrated to show how trending topics have evolved over time and how these relate to other events in the IT industry.
11
ASD/DS | MLOps / AIOps | There are a variety of different possibilities for deploying models in production. These can range from containerisation, serverless functions, virtual machines, Docker, etc. Additionally, these can provide different delivery mechanisms, such as APIs and REST calls.
This project will examine the various solutions available on AWS, GCP, Azure and Oracle Cloud to assess the ease of initial deployment, the ease of updating models in these environments, the ability to call and use the models, and benchmarking of these models based on number of calls per second, or similar.
12
DS | Using Pre-built Models for Image Classification and Knowledge Extraction | Most of the cloud providers (AWS, GCP, Oracle, Azure) are building ML/AI applications providing prebuilt models for image classification, object detection and text extraction. These are aimed at making it easier to use this technology and at allowing a wider audience (beyond the Data Scientist) to use this functionality without the need to know or understand what is happening under the hood, at the level of language, library, algorithm, parameters, training and testing.
This project will look at evaluating the offerings from these different cloud vendors, to assess the functionality provided, the ease of use, the range of possible use cases, their accuracy at prediction and knowledge extraction, etc.
All without the need to be a data scientist or machine learning expert.
The time of being an “expert” Data Scientist or Machine Learning expert has come to an end. Data Science and Machine Learning are no longer specialist skills, but are now generalist skills that people in IT, Social Sciences, Marketing, Physical Sciences, Chemical Sciences, and lots of other areas have.
13
ASD | Evaluation of Linux distributions | Linux is one of the most common/popular environments on most servers, including database servers, website hosts, application tiers, etc.
This project will look at evaluating 5-6 different Linux distributions (RedHat, Ubuntu, Oracle Linux, etc.), including free and open-source distributions. The evaluation will examine their performance for different use cases and with different workloads, to measure and compare their efficiency. You will be able to benchmark your results (and test environments) against publicly available results.
14
ASD | Evaluation of JOOQ versus Hibernate and others | Hibernate is widely used but has MANY MANY problems, which usually result in writing and executing very inefficient queries. This gives the impression that the Database is running slowly, but in reality the SQL code generated by Hibernate is just BAD.
An alternative is JOOQ (http://www.jooq.org/learn/). There are several other options. This project will evaluate JOOQ versus Hibernate (and others) to explore their differences, their limitations and the issues they cause, and will experiment to see which one really works best.
This project requires a good understanding of database internals and query optimization.
15
DA & ASD | Photographic Restoration using Deep Learning | Explore and evaluate various Deep Learning algorithms and models for restoring old photographic images into something that looks recent/modern.
For example, have a look at this GitHub repo for some examples:
https://github.com/TencentARC/GFPGAN
16
DA & ASD | GitHub Copilot | How good is it really?
How useful is it really?
What are the problems?
This project will perform a full evaluation of using GitHub Copilot for a number of applications, with comparison to code already written or written by experts, and comparison with alternative approaches.
17
DA & ASD | Algorithmic Trading | There are a lot of different algorithms available for trading stocks. Lots of recent research has focused on advanced Machine Learning techniques, but there have been mixed results, along with the added complexity of such solutions. Other solutions look at including simpler machine learning algorithms along with Natural Language Processing (NLP), Sentiment Analysis, etc. to make predictions.
But how do these approaches really compare with the more traditional approaches used for many decades, based on simple statistics such as moving averages, regression, etc.?
In this project you will evaluate some of these to determine what works, what kind of works, what is just too complicated, etc.
There are lots of possibilities for this project and each person can have their own focus and interest areas.
Open to many students working on this simultaneously.
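To make the "traditional" baseline mentioned above concrete, here is a minimal Python sketch of a moving-average crossover rule. It emits "buy" when a short moving average rises above a long one and "sell" when it falls back below. The function names, window lengths and signal labels are assumptions chosen for illustration; this is not a recommended trading strategy.

```python
def moving_average(prices, n):
    """Simple moving average; None until n prices are available."""
    return [
        None if i + 1 < n else sum(prices[i + 1 - n:i + 1]) / n
        for i in range(len(prices))
    ]


def crossover_signals(prices, short=2, long=3):
    """Return 'buy', 'sell' or 'hold' for each price.

    A signal is emitted only when the short average crosses the long
    average; window sizes are illustrative assumptions.
    """
    fast = moving_average(prices, short)
    slow = moving_average(prices, long)
    signals = []
    prev_above = None  # whether fast was above slow at the previous step
    for f, s in zip(fast, slow):
        if f is None or s is None:
            signals.append("hold")
            continue
        above = f > s
        if prev_above is not None and above != prev_above:
            signals.append("buy" if above else "sell")
        else:
            signals.append("hold")
        prev_above = above
    return signals


if __name__ == "__main__":
    print(crossover_signals([5, 4, 3, 4, 5, 6, 5, 4, 3]))
```

A baseline like this gives the more complex ML and NLP approaches something simple to be benchmarked against on the same price series.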
18
DA & ASD | Evaluation of Application Translation Layers to support Application migrations | For example, Babelfish is a translation layer for Amazon Aurora PostgreSQL that enables Aurora to understand commands from applications written for Microsoft SQL Server.
But can it work for other Databases?
What are the alternative solutions to Babelfish? Can you compare these products and the impact they have on portability, application performance, etc.?
19
DA & ASD | Evaluating the Ethical and Legal Implications of Data Mining, Machine Learning and Analytics | Over the past few years there has been a growing interest in the Ethics and Legal Implications of Analytics, Data Science, Data Mining, Machine Learning, etc. These two topics are very different and yet they have a large overlap.
This project will examine how the EU and other political regions have adapted their legal systems. Additionally, we have seen a number of different Ethics frameworks being put forward. Given the nature of these two topics, the project will examine these, defining overlaps, gaps, directional changes and how state-of-the-art research can improve current practices.
20
DA & ASD | Evaluation of MLOps Frameworks | A large number of MLOps frameworks exist and the list is constantly growing. But how good or how complete are they?
This project will define a set of evaluation criteria, based on research, and will then apply the criteria to 3-5 MLOps frameworks to determine their completeness, efficiency and cross-platform support, among other things.
21
ASD | Voice Controlling a Database (Voice recognition & Text to speech; Building an Accessible Application) | Everyone uses SQL to access and process data in a Database. SQL is a 40+ year old language and is very commonly used in all databases, no matter their type.
This project will look at building an accessible interface to the database, allowing people to create SQL queries using voice instructions. These instructions can have the same structure as a typical SQL statement, both ANSI SQL and the SQL implementation of the chosen database.
Additional features will examine using a markup-type syntax for SQL. There are a number of examples of this, and the project will look to implement one of the most commonly used. SQL Markup syntax is a form of shorthand syntax for writing SQL. The application will take instructions from the user's voice commands and will construct the SQL, providing prompts and feedback to the user as necessary.
When the query results are retrieved, the application will converse with the user on how to share these results, from providing some aggregate and summary information, to playing back all or part of the query result set.
See this article for some examples.
Some more examples
22
DA & ASD | Evaluation of lite ML frameworks for deployment on IoT devices, phones, tablets and other similar devices | In recent times there has been a move to deploy ML and other advanced analytics on low-power computing devices.
This project will examine the various Frameworks, Libraries and supporting languages for deployment on such devices. Careful measurement is needed to determine and evaluate the real effect of doing this and its viability.
23
DA & ASD | Adding new functionality to a Database: Can you add Data Graphing capabilities to a Database, providing functionality similar to ggplot2, that can be called using SQL? | The Oracle Database allows you to extend the functionality of the Database by including External Procedures (external procs). These allow you to write functions using Java and C and have them registered in the Database, so that they can be called from the Database using SQL.
This project will look at creating functions, using external procs, to provide functionality similar to ggplot2 (in R), and to make this functionality accessible using SQL in your typical SELECT statements. The project will evaluate the performance of using external procs and examine the integration of this functionality with other aspects of the Database. All images produced should be in BLOB format, allowing for the querying, displaying and storing of these images in the database.
24
DA | Comparison and Evaluation of Machine Learning Interchange Formats and Tools | There are a number of machine learning interchange formats, including PMML, PFA, ONNX, and many more. This project will examine these, developing examples to show their capabilities and weaknesses across multiple languages. What format to use, for what kind of models, for what languages, and many more aspects will be considered.
25
DA & ASD | Fake News Detection | Over the past few years we have been hearing a lot about fake news. The aim of this project is to explore the research in this area and to develop a fake news detection algorithm. Building upon previous research, the project will select various elements and apply them to a regional context (e.g. Ireland), and then compare how the algorithm performs for other regions. By doing this you will be able to assess whether there are geographic variations in how fake news is used around the world.
26
DA & ASD | Are Mobile Devices suitable for Machine Learning | (Almost) everyone carries a mobile device and these contain many different applications. Recently there have been some advances in building machine learning capabilities into these applications. Various libraries and frameworks are being made available to enable machine learning on mobile devices. But is this really possible? This project will explore the various ML solutions for mobile devices, examine the various languages/solutions and evaluate the extent of ML capabilities on mobile devices. See Firebase ML Kit for an example. Others include Apple CoreML, TensorFlow Lite, etc.
27
DA & ASD | Personalized Advertisements based on Facial Recognition | Using a tablet device to deliver advertisements, monitor the facial reactions of the person watching to judge their level of interest in and attention to the adverts. Using this feedback, use machine learning to determine what adverts to display next. With feedback gathered through facial recognition and user engagement, build a full platform and architecture to support the delivery of the solution.
28
ASD | Real-time Database Monitoring tool | Build an application to monitor the database and all internal processes in real time. The tool should provide informative visualizations and data insights on what is happening, using various trend analysis and ML techniques to identify anomalies and raise alerts. Can this tool be built to work with more than one database vendor?
29
DA & ASD | Augmented Data Analysis and Machine Learning | Build an augmented data analysis and machine learning tool, capable of loading any data set, analyzing it, understanding it, visualizing the data, performing data enrichment, identifying feature engineering and identifying possible ML algorithms to use. All of this should be done automatically, with just a click of a button from the user. All they need to do is specify the data set.
30
DA & ASD | Data Indexing using Machine Learning | Check out the paper by the Google AI team on using neural networks as an alternative to B-tree indexes. Can you build something similar? Can you improve on their design? Can other ML algorithms be used? How does this scale? There are lots and lots of possibilities with this project.
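As a very small illustration of the learned-index idea, the Python sketch below replaces the neural network with a single least-squares line that predicts a key's position in a sorted array, then corrects the guess with a binary search bounded by the model's worst-case error. The class name `LinearLearnedIndex`, the use of a linear model, and the assumption of unique, non-empty integer keys are simplifications for this sketch and are not the design from the paper.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy learned index: a fitted line predicts position, bisect corrects it.

    Assumes a non-empty collection of unique numeric keys (an assumption
    made for this sketch).
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys bounds the search.
        self.max_err = max(
            abs(i - self._predict(k)) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return round(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return the position of key in the sorted keys, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(n, max(0, guess - self.max_err))
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


if __name__ == "__main__":
    index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 99])
    print(index.lookup(23), index.lookup(17))
```

The smaller `max_err` is, the narrower the final search. That makes it a natural quantity to measure when comparing other models or larger data sets against a B-tree.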
31
DA | Analysing people's musical tastes | This project will look at examining the musical characteristics of a person's favorite music, taking in batches of various sizes to determine the optimal number of compositions needed to determine a style. The music will be broken down into key components and compared across all music in the batch. A similar approach can be used to analyze how music styles have evolved over time for different musicians.
32
ASD | Evaluation of Low Code development environments | In recent years a number of Low Code software development environments have evolved. This project will look at evaluating 3-5 of these to examine their features, development effort, required developer skills, adoption within enterprises and what impact these types of low code development environments will have in future.
33
ASD | Walk with me – for the visually impaired | This project will look at using a Raspberry Pi enabled camera to allow the visually impaired to walk down a street unaided. The camera will constantly scan the environment, taking pictures in real time, scanning these images and then providing voice descriptions of the environment. This will allow the person to visualize their environment. When the person walks, the image and data processing will detect this and will feed motion-related information to the user, such as certain objects getting closer or moving away. The system will also identify potential hazards such as people, rubbish bins and other obstacles. All image and other processing is to be performed on a Raspberry Pi.
34
DA & ASD | Real-time Anomaly Detection of Server or Database Alert logs | You need to have access to server or database activity logs for this project. Using anomaly detection, along with variations in time periods, identify unusual activity and provide an appropriate level of notification and information about the alert.
35
DA | Synthetic data generation for imbalanced data sets | An examination of various methods for the generation of synthetic data for imbalanced data sets. Similar to the processing used in SMOTE, additional techniques will be used and evaluated to determine their effectiveness as input to machine learning.
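For orientation, the core interpolation step of a SMOTE-style oversampler can be sketched in a few lines of Python. Each synthetic point lies on the line segment between a minority-class sample and one of its nearest minority neighbours. The function name `smote_like`, the parameter defaults and the use of plain tuples are assumptions made for this sketch; it is a simplification, not the reference SMOTE implementation.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between neighbours.

    minority: list of equal-length numeric tuples (needs at least 2 points).
    k: number of nearest neighbours to choose from (illustrative default).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
    for p in smote_like(points, 3):
        print(p)
```

Because every new point is a convex combination of two existing minority samples, it always falls inside the region those samples span. Alternative generation techniques can be judged partly on how they relax or improve on that constraint.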
36
ASD | Twitter profile follower generator using Go | Using Google's Go language, build a library for the Twitter API. Then use this library to create an application that allows a user to increase their number of followers on Twitter. Various approaches should be evaluated and implemented using the newly created library. The application should identify, based on an existing user profile, how to increase the user's number of followers.
37
DA & ASD | Using Music and Machine Learning for Database Monitoring | Combine your love of music and machine learning to monitor Database activity. This will involve monitoring the database engine logs and using a combination of anomaly detection and moving windows of data to identify and capture the moving trends in the activity logs. Then take this activity and compose music (based on your favorite artist or genre) as a reporting mechanism. When unusual activity is identified, this needs to be reflected in the music.
38
ASD | Database storage 4.0 - Multi storage management for next stage database | The next phase of database management will see an integrated approach to how and where the data is stored. The past decade has seen the push by the Hadoop ecosystem to replace the traditional database. But given the installed base of traditional databases and differing analytic requirements, a complete migration will not happen. In the multi-storage environment, data within the database can reside on one or more storage media and locations. These can include in-memory, flash, solid state, disk and also Hadoop. Based on the information lifecycle management approach for defining where data will reside, a new framework is needed to dynamically manage the movement of the data based on the frequency of usage. The project will look at how the data can be dynamically and efficiently migrated between storage media with the minimum of downtime.
39
ASD | Automation of VM builds and migrations using Vagrant, Ansible, Docker and virtual machines, and how to automate the migration of these to different Cloud vendors | Automation of VM builds and migrations using Vagrant, Ansible, Docker and virtual machines, and how to automate the migration of these to different Cloud vendors.
40
DA | Evaluation of AutoML features across languages and tools | The use of Automated Machine Learning (AutoML) is going to replace the data scientist! Or so they say. This project will evaluate the various AutoML solutions offered by various vendors and languages to measure how good they really are, and how likely companies and data scientists are to trust them.
41
DA & ASD | Building a repository for continuously evolving self monitoring predictive analytics models | This project will focus on building the process to manage the autonomous building and rebuilding of predictive models within an adaptive intelligence project. You will be working with many predictive and machine learning algorithms, building automated tools for the selection of the appropriate algorithms depending on the underlying data sets. This project will integrate with the other Adaptive Intelligence projects, with the aim of developing an integrated solution that can easily be deployed in any environment.
42
ASD | Is it really possible to build a Big Data cluster using Raspberry Pis | In the era of big data and IoT, the cost associated with building a clustered environment can be huge. This project will look at building a 5-node Hadoop cluster using Raspberry Pis and evaluate how effective it is at capturing data at different delivery rates. The data can be sent for storage on the cluster using Kafka. The second part will examine the efficiency of data analysis using this cluster. [The student will have to purchase the required equipment]
43
ASD | Evaluation of JSON and complex objects in Oracle, PostgreSQL, SQL Server and DB2 | Most databases allow the creation of JSON objects within the database and then embed these into traditional database tables. This project will examine how the main database vendors have implemented these features and assess their capabilities and ability to scale. Additionally, most databases allow the creation of nested and other complex data structures. A similar evaluation will be performed on these.
44
ASD | Replacing the SQL query engine with JavaScript and Python | This project will perform a detailed evaluation of the Oracle Multilingual Engine, benchmarking its performance against the traditional SQL query engine.
45
ASD & DA | Expanding the analytical capabilities of the Database using the embedded JVM | Most enterprise-level databases come with an inbuilt JVM. This allows you to create new functions within the database using Java. This project will take a number of machine learning and other analytical functions, write these in optimized Java code, store them in the database and then evaluate their performance against the existing equivalent functions in the database.
46
ASD | Evaluation of live application upgrades with zero downtime | Evaluation of solutions from leading vendors for live application and database upgrades with zero downtime. For example, Oracle has a tool called Edition-Based Redefinition. Other vendors have similar products. This project will review these tools and will provide an evaluation and benchmark of their use.
47
ASD & DA | Evaluation of Big Data Machine Learning Languages | The Apache foundation has a number of machine learning projects. Some of these have a SQL interface. These include HiveMall, MADlib, Storm and others. This project will evaluate the machine learning capabilities of these languages, providing a number of worked scenarios. These scenarios will be benchmarked against each other.
48
ASD | Security issues of bi-directional cloud portability for applications | Many frameworks exist to help developers build applications for cloud native architectures, both those in the cloud as well as those behind the firewall. Applications are becoming more complex, with many components sitting in serverless and container environments hosted in the Cloud and behind a corporate firewall. This project will examine the implications of such frameworks and technical architectures and present a number of alternative solutions.
49
ASD | Fn: The next phase of application architectures | The Fn project is an open-source container-native serverless platform that you can run anywhere, on any cloud or on-premise. It’s easy to use, supports every programming language, and is extensible and performant. With Fn, you deploy your functions to an Fn server which automatically executes and manages them. Each function is executed in a Docker container, enabling the platform to provide broad support for development languages including Java, JavaScript (Node), Go, Python, Ruby, and others. The Fn project has a strong enterprise focus with emphasis on security, scalability, and observability. In serverless, the small piece of code that does all the work is called a Function, and a serverless cloud service typically provides functions-as-a-service (FaaS). Thus all the plumbing needed to provision, scale, patch and maintain the environment is provided by the service.