data mining Recently Published Documents

Distance Based Pattern Driven Mining for Outlier Detection in High Dimensional Big Dataset

Detecting outliers, or anomalies, is one of the vital issues in pattern-driven data mining. Outlier detection identifies the inconsistent behavior of individual objects. It is an important area of data mining with several applications, such as detecting credit card fraud, discovering hacking, and uncovering criminal activity. Tools are needed to uncover the critical information buried in extensive data. This paper investigates a novel method for detecting cluster outliers in a multidimensional dataset, capable of identifying the clusters and outliers in datasets containing noise. The proposed method can detect the groups and outliers left by the clustering process, like instant irregular sets of clusters (C) and outliers (O), to boost the results. Applying the algorithm to the dataset improved the results on several parameters. For the comparative analysis, the average accuracy and average recall are computed. The average accuracy is 74.05% for the existing COID algorithm and 77.21% for the proposed algorithm. The average recall is 81.19% for the existing algorithm and 89.51% for the proposed one, which shows that the proposed method performs better than the existing COID algorithm.
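
As a rough illustration of the distance-based idea in the abstract above, the sketch below scores each point by its mean distance to its nearest neighbours and then computes recall against known labels. It is a generic k-nearest-neighbour heuristic, not the COID algorithm or the paper's proposed method; the function names, the toy data, and the `factor` threshold are all assumptions made for this example.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_outliers(points, k=2, factor=3.0):
    """Flag points whose score exceeds `factor` times the median score.

    The median-based threshold is an assumption chosen for simplicity.
    """
    scores = knn_outlier_scores(points, k)
    median = sorted(scores)[len(scores) // 2]
    return [s > factor * median for s in scores]


def recall(predicted, actual):
    """Fraction of true outliers that were flagged."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    total_pos = sum(actual)
    return true_pos / total_pos if total_pos else 0.0


# Hypothetical toy data: two tight clusters plus one far-away point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (20.0, 20.0),
]
labels = [False, False, False, False, False, False, True]

flags = flag_outliers(points)
print(flags, recall(flags, labels))
```

On this toy data only the isolated point is flagged, so recall is 1.0. Real high-dimensional data would need a more careful choice of `k` and threshold.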

Implementation of Data Mining Technology in Bonded Warehouse Inbound and Outbound Goods Trade

For the taxed goods, the actual freight is generally determined by multiplying the allocated freight for each KG and actual outgoing weight based on the outgoing order number on the outgoing bill. Considering the conventional logistics is insufficient to cope with the rapid response of e-commerce orders to logistics requirements, this work discussed the implementation of data mining technology in bonded warehouse inbound and outbound goods trade. Specifically, a bonded warehouse decision-making system with data warehouse, conceptual model, online analytical processing system, human-computer interaction module and WEB data sharing platform was developed. The statistical query module can be used to perform statistics and queries on warehousing operations. After the optimization of the whole warehousing business process, it only takes 19.1 hours to get the actual freight, which is nearly one third less than the time before optimization. This study could create a better environment for the development of China's processing trade.

Multi-objective economic load dispatch method based on data mining technology for large coal-fired power plants

User activity classification and domain-wise ranking through social interactions.

Twitter has gained significant prevalence among users across numerous domains, in the majority of countries, and among different age groups. It serves as a real-time micro-blogging service for communication and opinion sharing. Twitter shares its data for research and study purposes by exposing open APIs, which makes it the most suitable source of data for social media analytics. Applying data mining and machine learning techniques to tweets is gaining more and more interest. The most prominent enigma in social media analytics is how to automatically identify and rank influencers. This research aims to detect users' topics of interest in social media and rank the users based on specific topics, domains, etc. A few hybrid parameters are also distinguished in this research based on the post's content, post’s metadata, user’s profile, and user's network features to capture different aspects of being influential, and they are used in the ranking algorithm. The results show that the proposed approach is effective in both the classification and the ranking of individuals in a cluster.

A data mining analysis of COVID-19 cases in states of United States of America

Epidemic diseases can be extremely dangerous because of their hazardous effects. They may have negative effects on economies, businesses, the environment, humans, and the workforce. In this paper, some of the factors related to the COVID-19 pandemic are examined using data mining methodologies and approaches. The analysis uncovered some rules and insights, and the performance of the data mining algorithms was evaluated. According to the results, the JRip algorithm had the highest correct classification rate and the lowest root mean squared error (RMSE). Considering the classification rate and RMSE, JRip can be considered an effective method for understanding the factors related to coronavirus-caused deaths.
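
For readers unfamiliar with the two measures named in this abstract, here is a minimal sketch of how a correct classification rate and an RMSE could be computed for a binary classifier. The probabilities and labels are made up for illustration, and computing RMSE over predicted probabilities against 0/1 labels is an assumption about the evaluation setup, not something the abstract states.

```python
import math


def classification_rate(predicted, actual):
    """Share of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predictions and true values."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Hypothetical predicted probabilities of the positive class and true labels.
probabilities = [0.9, 0.2, 0.6, 0.4]
actual = [1, 0, 1, 1]

# Turn probabilities into hard labels with a 0.5 cut-off (an assumption).
predicted_labels = [1 if p >= 0.5 else 0 for p in probabilities]

rate = classification_rate(predicted_labels, actual)
error = rmse(probabilities, actual)
print(f"classification rate: {rate:.2f}, RMSE: {error:.3f}")
```

A higher classification rate and a lower RMSE both indicate a better fit, which is the sense in which the abstract ranks JRip first.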

Exploring distributed energy generation for sustainable development: A data mining approach

A comprehensive guideline for Bengali sentiment annotation.

Sentiment Analysis (SA) is a Natural Language Processing (NLP) and Information Extraction (IE) task that primarily aims to determine whether the feelings a writer expresses are positive or negative by analyzing a large number of documents. SA is also widely studied in the fields of data mining, web mining, text mining, and information retrieval. The fundamental task in sentiment analysis is to classify the polarity of a given content as Positive, Negative, or Neutral. Although extensive research has been conducted in this area of computational linguistics, most of it has been carried out in the context of the English language. However, Bengali sentiment expression has varying degrees of sentiment labels, which can plausibly be distinct from those of English. Therefore, it is important that sentiment assessment of the Bengali language be developed and executed properly. In sentiment analysis, the predictive potential of an automatic model depends entirely on the quality of dataset annotation. Bengali sentiment annotation is a challenging task due to the diversified structures (syntax) of the language and its different degrees of innate sentiment (i.e., weakly and strongly positive/negative sentiments). Thus, in this article, we propose a novel and precise guideline for researchers, linguistic experts, and referees to annotate Bengali sentences accurately, with a view to building effective datasets for automatic sentiment prediction.

Capturing Dynamics of Information Diffusion in SNS: A Survey of Methodology and Techniques

Studying information diffusion in SNS (Social Networks Service) has remarkable significance in both academia and industry. Theoretically, it boosts the development of other subjects such as statistics, sociology, and data mining. Practically, diffusion modeling provides fundamental support for many downstream applications (e.g., public opinion monitoring, rumor source identification, and viral marketing). Tremendous efforts have been devoted to this area to understand and quantify information diffusion dynamics. This survey investigates and summarizes the emerging distinguished works in diffusion modeling. We first put forward a unified information diffusion concept in terms of three components: information, user decision, and social vectors, followed by a detailed introduction of the methodologies for diffusion modeling. And then, a new taxonomy adopting hybrid philosophy (i.e., granularity and techniques) is proposed, and we made a series of comparative studies on elementary diffusion models under our taxonomy from the aspects of assumptions, methods, and pros and cons. We further summarized representative diffusion modeling in special scenarios and significant downstream tasks based on these elementary models. Finally, open issues in this field following the methodology of diffusion modeling are discussed.

The Influence of E-book Teaching on the Motivation and Effectiveness of Learning Law by Using Data Mining Analysis

This paper studies the motivation of learning law, compares the teaching effectiveness of two different teaching methods, e-book teaching and traditional teaching, and analyses the influence of e-book teaching on the effectiveness of law by using big data analysis. From the perspective of law student psychology, e-book teaching can attract students' attention, stimulate students' interest in learning, deepen knowledge impression while learning, expand knowledge, and ultimately improve the performance of practical assessment. With a small sample size, there may be some deficiencies in the research results' representativeness. To stimulate the learning motivation of law as well as some other theoretical disciplines in colleges and universities has particular referential significance and provides ideas for the reform of teaching mode at colleges and universities. This paper uses a decision tree algorithm in data mining for the analysis and finds out the influencing factors of law students' learning motivation and effectiveness in the learning process from students' perspective.

Intelligent Data Mining based Method for Efficient English Teaching and Cultural Analysis

The emergence of online education has greatly helped improve the quality of traditional English teaching. However, it only moves the teaching process from offline to online, which does not really change the essence of traditional English teaching. In this work, we study an intelligent English teaching method to further improve the quality of English teaching. Specifically, a random forest is first used to analyze and extract the grammatical and syntactic features of the English text. Then, a decision-tree-based method is proposed to predict grammar or syntax issues in the English text. The evaluation results indicate that the proposed method can effectively improve the accuracy of English grammar and syntax recognition.

PhD in Data Science – Your Guide to Choosing a Doctorate Degree Program

Created by aasif.faizal

Professional opportunities in data science are growing incredibly fast. That’s great news for students looking to pursue a career as a data scientist. But it also means that there are a lot more options out there to investigate and understand before developing the best educational path for you.

A PhD is the most advanced data science degree you can get, reflecting a depth of knowledge and technical expertise that will put you at the top of your field.

PhD programs are also the most time-intensive degree option out there, typically requiring students to complete a dissertation involving rigorous research, so they are not for everyone. Indeed, many who work in the world of big data hold master’s degrees rather than PhDs; master’s programs tend to involve the same coursework as PhD programs without a dissertation component. However, for the right candidate, a PhD program is the perfect choice for becoming a true expert in your area of focus.

If you’ve concluded that a data science PhD is the right path for you, this guide is intended to help you choose the best program to suit your needs. It will walk through some of the key considerations while picking graduate data science programs and some of the nuts and bolts (like course load and tuition costs) that are part of the data science PhD decision-making process.

Data Science PhD vs. Masters: Choosing the right option for you

If you’re considering pursuing a data science PhD, it’s worth knowing that such an advanced degree isn’t strictly necessary in order to get good work opportunities. Many who work in the field of big data only hold master’s degrees, which is the level of education expected to be a competitive candidate for data science positions.

So why pursue a data science PhD?

Simply put, a PhD in data science will leave you qualified to enter the big data industry at a high level from the outset.

You’ll be eligible for advanced positions within companies, holding greater responsibilities, keeping more direct communication with leadership, and having more influence on important data-driven decisions. You’re also likely to receive greater compensation to match your rank.

However, PhDs are not for everyone. Dissertations require a great deal of time and an interest in intensive research. If you are eager to jumpstart a career quickly, a master’s program will give you the preparation you need to hit the ground running. PhDs are appropriate for those who want to commit their time and effort to schooling as a long-term investment in their professional trajectory.

For more information on the difference between data science PhD and master’s programs, take a look at our guide here.

Topics include:

  • Can I Get an Online PhD in Data Science?
  • Overview of PhD Coursework
  • Preparing for a Doctorate Program
  • Building a Solid Track Record of Professional Experience
  • Things to Consider When Choosing a School
  • What Does It Cost to Get a PhD in Data Science?
  • School Listings

Data Science PhD Programs, Historically

Historically, data science PhD programs were one of the main avenues to a good data-related position in academia or industry. But PhD programs are heavily research oriented and require a fairly long-term investment of time, money, and energy. The issue that some data science PhD holders report, especially in industry settings, is that the state of the art is moving so quickly, and the data science industry is evolving so rapidly, that an abundance of research-oriented expertise is not always what is most sought after.

Instead, many companies are looking for candidates who are up to date with the latest data science techniques and technologies, and are willing to pivot to match emerging trends and practices.

One recent development that is making data science graduate school decisions more complex is the introduction of specialty master’s degrees that focus on rigorous but compact professional training. Both students and companies are realizing the value of an intensive, more industry-focused degree that provides enough training to manage complex projects and that is client oriented as opposed to research oriented.

However, not all prospective data science PhD students are looking for jobs in industry. There are some pretty amazing research opportunities opening up across a variety of academic fields that are making use of new data collection and analysis tools. Experts that understand how to leverage data systems including statistics and computer science to analyze trends and build models will be in high demand.

Can You Get a PhD in Data Science Online?

While it is not common to get a data science Ph.D. online, there are currently two options for those looking to take advantage of the flexibility of an online program.

Indiana University Bloomington and Northcentral University both offer online Ph.D. programs with either a minor or specialization in data science.

Given the trend for schools to continue increasing online offerings, expect to see additional schools adding this option in the near future.

Overview of PhD Coursework

A PhD requires a lot of academic work, which generally takes between four and five years (sometimes longer) to complete.

Here are some of the high level factors to consider and evaluate when comparing data science graduate programs.

How many credits are required for a PhD in data science?

On average, it takes 71 credits to graduate with a PhD in data science, far more (almost double) than traditional master’s degree programs require. In addition to coursework, most PhD students also have research and teaching responsibilities that can be simultaneously demanding and really great career preparation.

What’s the core curriculum like?

In a data science doctoral program, you’ll be expected to learn many skills and also how to apply them across domains and disciplines. Core curriculums will vary from program to program, but almost all will have a core foundation of statistics.

All PhD candidates will have to take a qualifying exam. This can vary from university to university, but to give you some insight, it is broken up into three phases at Yale. They have a practical exam, a theory exam and an oral exam. The goal is to make sure doctoral students are developing the appropriate level of expertise.


Dissertation

One of the final steps of a PhD program involves presenting original research findings in a formal document called a dissertation. These will provide background and context, as well as findings and analysis, and can contribute to the understanding and evolution of data science. A dissertation idea most often provides the framework for how a PhD candidate’s graduate school experience will unfold, so it’s important to be thoughtful and deliberate while considering research opportunities.

Preparing for a Doctorate Program

Since data science is such a rapidly evolving field and because choosing the right PhD program is such an important factor in developing a successful career path, there are some steps that prospective doctoral students can take in advance to find the best-fitting opportunity.

Join professional associations

Even before being fully credentialed, joining professional associations and organizations such as the Data Science Association and the American Association of Big Data Professionals is a good way to get exposure to the field. Many professional societies are welcoming to new members and even encourage student participation with things like discounted membership fees and awards and contest categories for student researchers. One of the biggest advantages of joining is that these professional associations bring data scientists together for conference events, research-sharing opportunities, networking, and continuing education.

Leverage your social network

Be on the lookout to make professional connections with professors, peers, and members of industry. There are a number of LinkedIn groups dedicated to data science. A well-maintained professional network is always useful to have when looking for advice or letters of recommendation while applying to graduate school and then later while applying for jobs and other career-related opportunities.

Kaggle competitions

Kaggle competitions provide the opportunity to solve real-world data science problems and win prizes. A list of data science problems can be found at . Winning one of these competitions is a good way to demonstrate professional interest and experience.


Internships

Internships are a great way to get real-world experience in data science while also getting to work for top names in the world of business. For example, IBM offers a data science internship that could help you stand out when applying for PhD programs, as well as when seeking employment in the future.

Building a solid track record of professional experience

Demonstrating professional experience is not only important when looking for jobs, but it can also help while applying for graduate school. There are a number of ways for prospective students to gain exposure to the field and explore different facets of data science careers.

Get certified

There are a number of data-related certificate programs that are open to people with a variety of academic and professional experience. DeZyre has an excellent guide to different certifications, some of which might help provide good background for graduate school applications.


Conferences

Conferences are a great place to meet people presenting new and exciting research in the data science field and to bounce ideas off newfound connections. As with professional societies and organizations, discounted student rates are available to encourage student participation. In addition, some conferences will waive fees if you are presenting a poster or research at the conference, which is an extra incentive to present.

Things to consider when choosing a school

It can be hard to quantify what makes a good fit when it comes to data science graduate school programs. There are easy-to-evaluate factors, such as cost and location, and then there are harder-to-evaluate criteria such as networking opportunities, accessibility to professors, and how up to date the program’s curriculum is.

Nevertheless, there are some key relevant considerations when applying to almost any data science graduate program.

What most schools will require when applying:

  • All undergraduate and graduate transcripts
  • A statement of intent for the program (reason for applying and future plans)
  • Letters of reference
  • Application fee
  • Online application
  • A curriculum vitae (outlining all of your academic and professional accomplishments)

What Does it Cost to Get a PhD in Data Science?

The great news is that many PhD data science programs are supported by fellowships and stipends. Some are completely funded, meaning the school will pay tuition and basic living expenses. Here are several examples of fully funded programs:

  • University of Southern California
  • University of Nevada, Reno
  • Kennesaw State University
  • Worcester Polytechnic Institute
  • University of Maryland

For all other programs, tuition can range from $1,300 to $2,000 per credit hour, depending on the school. Remember, typical PhD programs in data science are between 60 and 75 credit hours, meaning you could spend up to $150,000 over several years.
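
The upper figure follows from simple multiplication. Here is a minimal sketch of that arithmetic, using only the per-credit rates and credit-hour ranges quoted above; the function name is hypothetical, and the estimate ignores fees, living costs, and any funding.

```python
def tuition_range(credits_low, credits_high, rate_low, rate_high):
    """Return the (lowest, highest) total tuition for the given ranges."""
    return credits_low * rate_low, credits_high * rate_high


# 60 to 75 credit hours at $1,300 to $2,000 per credit hour.
low, high = tuition_range(60, 75, 1_300, 2_000)
print(f"${low:,} to ${high:,}")  # $78,000 to $150,000
```

So an unfunded program could cost between roughly $78,000 and $150,000 in tuition alone.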

That’s why the financial aspects are so important to evaluate when assessing PhD programs, because some schools offer full stipends so that you are able to attend without having to find supplemental scholarships or tuition assistance.

Can I become a professor of data science with a PhD? Yes! If you are interested in teaching at the college or graduate level, a PhD is the degree needed to establish the full expertise expected of a professor. Some data scientists who hold PhDs start by entering the field of big data and pivot over to teaching after gaining a significant amount of work experience. If you’re driven to teach others or to pursue advanced research in data science, a PhD is the right degree for you.

Do I need a master’s in order to pursue a PhD? No. Many who pursue PhDs in data science do not already hold advanced degrees, and many PhD programs include all the coursework of a master’s program in the first two years of school. For many students, this is the most time-effective option, allowing you to complete your education in a single pass rather than interrupting your studies after your master’s program.

Can I choose to pursue a PhD after already receiving my master’s? Yes. A master’s program can be an opportunity to get the lay of the land and determine the specific career path you’d like to forge in the world of big data. Some schools may allow you to simply extend your academic timeline after receiving your master’s degree, and it is also possible to return to school to receive a PhD if you have been working in the field for some time.

If a PhD isn’t necessary, is it a waste of time? While not all students are candidates for PhDs, a PhD is a great choice for the right students: those who are keen on doing in-depth research, have the time to devote to many years of school, and potentially have an interest in continuing to work in academia. For more information on this question, take a look at our article Is a Data Science PhD. Worth It?

Complete List of Data Science PhD Programs

Below you will find the most comprehensive list of schools offering a doctorate in data science. Each school listing contains a link to the program-specific page, GRE or master’s degree requirements, and a link to a page with detailed course information.

Note that the listing only contains true data science programs. Other similar programs are often lumped together on other sites, but we have chosen to list programs such as data analytics and business intelligence in a separate section of the website.

Boise State University – Boise, Idaho
PhD in Computing – Data Science Concentration

The Data Science emphasis focuses on the development of mathematical and statistical algorithms, software, and computing systems to extract knowledge or insights from data.  

In 60 credits, students complete an Introduction to Graduate Studies, 12 credits of core courses, 6 credits of data science elective courses, 10 credits of other elective courses, a Doctoral Comprehensive Examination worth 1 credit, and a 30-credit dissertation.

Electives can be taken in focus areas such as Anthropology, Biometry, Ecology/Evolution and Behavior, Econometrics, Electrical Engineering, Earth Dynamics and Informatics, Geoscience, Geostatistics, Hydrology and Hydrogeology, Materials Science, and Transportation Science.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $7,236 total (Resident), $24,573 total (Non-resident)

Bowling Green State University – Bowling Green, Ohio
Ph.D. in Data Science

Data Science students at Bowling Green intertwine knowledge of computer science with statistics.

Students learn techniques in analyzing structured, unstructured, and dynamic datasets.

Courses train students to understand the principles of analytic methods and to articulate the strengths and limitations of those methods.

The program requires 60 credit hours in the studies of Computer Science (6 credit hours), Statistics (6 credit hours), Data Science Exploration and Communication, Ethical Issues, Advanced Data Mining, and Applied Data Science Experience.

Students must also complete 21 credit hours of elective courses, a qualifying exam, a preliminary exam, and a dissertation.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $8,418 (Resident), $14,410 (Non-resident)

Brown University – Providence, Rhode Island
PhD in Computer Science – Concentration in Data Science

Brown University’s database group is a world leader in systems-oriented database research; they seek PhD candidates with strong system-building skills who are interested in researching TupleWare, MLbase, MDCC, Crowd DB, or PIQL.

In order to gain entrance, applicants should consider first doing a research internship at Brown with this group. Other ways to boost an application are to take and do well in massive open online courses, do an internship at a large company, and get involved in a large open-source software project.

Coding well in C++ is preferred.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $62,680 total

Chapman University – Irvine, California
Doctorate in Computational and Data Sciences

Candidates for the doctorate in computational and data science at Chapman University begin by completing 13 core credits in basic methodologies and techniques of computational science.

Students complete 45 credits of electives, which are personalized to match the specific interests and research topics of the student.

Finally, students complete up to 12 credits in dissertation research.

Applicants must have completed courses in differential equations, data structures, and probability and statistics, or take specific foundation courses, before beginning coursework toward the PhD.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $37,538 per year

Clemson University / Medical University of South Carolina (MUSC) – Joint Program – Clemson, South Carolina & Charleston, South Carolina
Doctor of Philosophy in Biomedical Data Science and Informatics – Clemson

The PhD in biomedical data science and informatics is a joint program co-authored by Clemson University and the Medical University of South Carolina (MUSC).

Students choose one of three tracks to pursue: precision medicine, population health, and clinical and translational informatics. Students complete 65-68 credit hours, and take courses in each of 5 areas: biomedical informatics foundations and applications; computing/math/statistics/engineering; population health, health systems, and policy; biomedical/medical domain; and lab rotations, seminars, and doctoral research.

Applicants must have a bachelor’s in health science, computing, mathematics, statistics, engineering, or a related field, and it is recommended to also have competency in a second of these areas.

Program requirements include a year of calculus and college biology, as well as experience in computer programming.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $10,858 total (South Carolina Resident), $22,566 total (Non-resident)

George Mason University – Fairfax, Virginia
Doctor of Philosophy in Computational Sciences and Informatics – Emphasis in Data Science

George Mason’s PhD in computational sciences and informatics requires a minimum of 72 credit hours, though this can be reduced if a student has already completed a master’s. 48 credits are toward graduate coursework, and an additional 24 are for dissertation research.

Students choose an area of emphasis (either computer modeling and simulation or data science) and complete 18 credits of coursework in this area. Students are expected to complete the coursework in 4-5 years.

Applicants to this program must have a bachelor’s degree in a natural science, mathematics, engineering, or computer science, and must have knowledge and experience with differential equations and computer programming.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $13,426 total (Virginia Resident), $35,377 total (Non-resident)

Harrisburg University of Science and Technology – Harrisburg, Pennsylvania
Doctor of Philosophy in Data Sciences

Harrisburg University’s PhD in data science is a 4-5 year program, the first 2 of which make up the Harrisburg master’s in analytics.

Beyond this, PhD candidates complete six milestones to obtain the degree, including 18 semester hours in doctoral-level courses, such as multivariate data analysis, graph theory, and machine learning.

Following the completion of ANLY 760 Doctoral Research Seminar, students in the program complete their 12 hours of dissertation research, bringing the total program hours to 36.

Delivery Method: Campus
GRE: Required
2022-2023 Tuition: $14,940 total

Icahn School of Medicine at Mount Sinai – New York, New York
Genetics and Data Science, PhD

As part of the Biomedical Science PhD program, the Genetics and Data Science multidisciplinary training offers research opportunities that expand on genetic research and modern genomics. The training also integrates several disciplines of biomedical sciences with machine learning, network modeling, and big data analysis.

Students in the Genetics and Data Science program complete a predetermined course schedule with a total of 64 credits and 3 years of study.

Additional course requirements and electives include laboratory rotations, a thesis proposal exam and thesis defense, Computer Systems, Intro to Algorithms, Machine Learning for Biomedical Data Science, Translational Genomics, and Practical Analysis of a Personal Genome.

Delivery Method: Campus
GRE: Not Required
2022-2023 Tuition: $31,303 total

Indiana University-Purdue University Indianapolis – Indianapolis, Indiana
PhD in Data Science
PhD Minor in Applied Data Science

Doctoral candidates pursuing the PhD in data science at Indiana University-Purdue must display competency in research, data analytics, and management and infrastructure to earn the degree.

The PhD is comprised of 24 credits of a data science core, 18 credits of methods courses, 18 credits of a specialization, written and oral qualifying exams, and 30 credits of dissertation research. All requirements must be completed within 7 years.

Applicants are generally expected to have a master’s in social science, health, data science, or computer science. 

Currently, a majority of the PhD students at IUPUI are funded by faculty grants, and two are funded by the federal government. None of the students are self-funded.

IUPUI also offers a PhD Minor in Applied Data Science that is 12-18 credits. The minor is open to students enrolled at IUPUI or IU Bloomington in a doctoral program other than Data Science.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $9,228 per year (Indiana Resident), $25,368 per year (Non-resident)

Jackson State University – Jackson, Mississippi PhD Computational and Data-Enabled Science and Engineering

Jackson State University offers a PhD in computational and data-enabled science and engineering with 5 concentration areas: computational biology and bioinformatics, computational science and engineering, computational physical science, computational public health, and computational mathematics and social science.

Students complete 12 credits of common core courses, 12 credits in the specialization, 24 credits of electives, and 24 credits in dissertation research.

Students may complete the doctoral program in as little as 5 years and no more than 8 years.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $8,270 total

Kennesaw State University  – Kennesaw, Georgia PhD in Analytics and Data Science

Students pursuing a PhD in analytics and data science at Kennesaw State University must complete 78 credit hours: 48 course hours and 6 elective hours (spread over 4 years of study), a minimum of 12 credit hours of dissertation research, and a minimum 12 credit-hour internship.

Prior to dissertation research, the comprehensive examination will cover material from the three areas of study: computer science, mathematics, and statistics.

Successful applicants will have a master’s degree in a computational field, coursework in calculus I and II, programming experience, and modeling experience. They are also encouraged to hold a base SAS certification.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $5,328 total (Georgia Resident), $19,188 total (Non-resident)

New Jersey Institute of Technology  – Newark, New Jersey PhD in Business Data Science

Students may enter the PhD program in business data science at the New Jersey Institute of Technology with either a relevant bachelor’s or master’s degree. Students with bachelor’s degrees begin with 36 credits of advanced courses, and those with master’s take 18 credits before moving on to credits in dissertation research.

Core courses include business research methods, data mining and analysis, data management system design, statistical computing with SAS and R, and regression analysis.

Students take qualifying examinations at the end of years 1 and 2, and must defend their dissertations successfully by the end of year 6.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $21,932 total (New Jersey Resident), $32,426 total (Non-resident)

New York University  – New York, New York PhD in Data Science

Doctoral candidates in data science at New York University must complete 72 credit hours, pass a comprehensive and qualifying exam, and defend a dissertation within 10 years of entering the program.

Required courses include an introduction to data science, probability and statistics for data science, machine learning and computational statistics, big data, and inference and representation.

Applicants must have an undergraduate or master’s degree in fields such as mathematics, statistics, computer science, engineering, or other scientific disciplines. Experience with calculus, probability, statistics, and computer programming is also required.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $37,332 per year

Northcentral University  – San Diego, California PhD in Data Science-TIM

Northcentral University offers a PhD in technology and innovation management with a specialization in data science.

The program requires 60 credit hours, including 6-7 core courses, 3 in research, a PhD portfolio, and 4 dissertation courses.

The data science specialization requires 6 courses: data mining, knowledge management, quantitative methods for data analytics and business intelligence, data visualization, predicting the future, and big data integration.

Applicants must already hold a master’s degree.

Delivery Method: Online GRE: Required 2022-2023 Tuition: $16,794 total

Stevens Institute of Technology – Hoboken, New Jersey Ph.D. in Data Science

Stevens Institute of Technology has developed a data science Ph.D. program geared to help graduates become innovators in the space.

The rigorous curriculum emphasizes mathematical and statistical modeling, machine learning, computational systems and data management.

The program is directed by Dr. Ted Stohr, a recognized thought leader in the information systems, operations and business process management arenas.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $39,408 per year

University at Buffalo – Buffalo, New York PhD Computational and Data-Enabled Science and Engineering

The curriculum for the University at Buffalo’s PhD in computational and data-enabled science and engineering centers on three areas: data science, applied mathematics and numerical methods, and high-performance and data-intensive computing. Nine credits of courses must be completed in each of these three areas. Altogether, the program consists of 72 credit hours and should be completed in 4-5 years. A master’s degree is required for admission; courses taken during the master’s may count toward some of the core coursework requirements.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $11,310 per year (New York Resident), $23,100 per year (Non-resident)

University of Colorado Denver – Denver, Colorado PhD in Big Data Science and Engineering

The University of Colorado – Denver offers a unique program for those students who have already received admission to the computer science and information systems PhD program.

The Big Data Science and Engineering (BDSE) program is a PhD fellowship program that allows selected students to pursue research in the area of big data science and engineering. This new fellowship program was created to train more computer scientists in data science application fields such as health informatics, geosciences, precision and personalized medicine, business analytics, and smart cities and cybersecurity.

Students in the doctoral program must complete 30 credit hours of computer science classes beyond a master’s level, and 30 credit hours of dissertation research.

The BDSE fellowship requires students to have an advisor both in the core disciplines (either computer science or mathematics and statistics) as well as an advisor in the application discipline (medicine and public health, business, or geosciences).

In addition, the fellowship covers a full stipend, tuition, and fees of up to roughly $50,000 per year for BDSE fellows. Additional eligibility requirements apply.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $55,260 total

University of Maryland – College Park, Maryland PhD in Information Studies

Data science is a potential research area for doctoral candidates in information studies at the University of Maryland – College Park. This includes big data, data analytics, and data mining.

Applicants for the PhD must have taken the following courses in undergraduate studies: programming languages, data structures, design and analysis of computer algorithms, calculus I and II, and linear algebra.

Students must complete 6 qualifying courses, 2 elective graduate courses, and at least 12 credit hours of dissertation research.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $16,238 total (Maryland Resident), $35,388 total (Non-resident)

University of Massachusetts Boston  – Boston, Massachusetts PhD in Business Administration – Information Systems for Data Science Track

The University of Massachusetts – Boston offers a PhD in information systems for data science. As this is a business degree, students complete coursework in their first two years with a focus on data for business, taking courses such as Business in Context: Markets, Technologies, and Societies.

Students must take and pass qualifying exams at the end of year 1, comprehensive exams at the end of year 2, and defend their theses at the end of year 4.

Those with a degree in statistics, economics, math, computer science, management sciences, information systems, or another related field are especially encouraged to apply, though a quantitative degree is not necessary.

Students accepted by the program are ordinarily offered full tuition credits and a stipend ($25,000 per year) to cover educational expenses and help defray living costs for up to three years of study.

During the first two years of coursework, students are assigned to a faculty member as research assistants; in the third year, they are engaged in instructional activities. Funding for the fourth year is merit-based and drawn from a limited pool of program funds.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $18,894 total (in-state), $36,879 (out-of-state)

University of Nevada Reno – Reno, Nevada PhD in Statistics and Data Science

The University of Nevada – Reno’s doctoral program in statistics and data science comprises 72 credit hours to be completed over the course of 4-5 years. Coursework is all within the scope of statistics, with titles such as statistical theory, probability theory, linear models, multivariate analysis, statistical learning, statistical computing, and time series analysis.

The completion of a Master’s degree in mathematics or statistics prior to enrollment in the doctoral program is strongly recommended, but not required.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $5,814 total (in-state), $22,356 (out-of-state)

University of Southern California – Los Angeles, California PhD in Data Sciences & Operations

USC Marshall School of Business offers a PhD in data sciences and operations to be completed in 5 years.

Students can choose either a track in operations management or in statistics. Both tracks require 4 courses in fall and spring of the first 2 years, as well as a research paper and courses during the summers. Year 3 is devoted to dissertation preparation and year 4 and/or 5 to dissertation defense.

A bachelor’s degree is necessary for application, but no field or further experience is required.

Students should complete 60 units of coursework. If the students are admitted with Advanced Standing (e.g., Master’s Degree in appropriate field), this requirement may be reduced to 40 credits.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $63,468 total

University of Tennessee-Knoxville  – Knoxville, Tennessee The Data Science and Engineering PhD

The data science and engineering PhD at the University of Tennessee – Knoxville requires 36 hours of coursework and 36 hours of dissertation research. For those entering with an MS degree, only 24 hours of course work is required.

The core curriculum includes work in statistics, machine learning, and scripting languages and is enhanced by 6 hours in courses that focus either on policy issues related to data, or technology entrepreneurship.

Students must also choose a knowledge specialization in one of these fields: health and biological sciences, advanced manufacturing, materials science, environmental and climate science, transportation science, national security, urban systems science, and advanced data science.

Applicants must have a bachelor’s or master’s degree in engineering or a scientific field. 

All admitted students are supported by a research fellowship, and tuition is included.

Many students will perform research with scientists from Oak Ridge National Laboratory, which is about a 30-minute drive from campus.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $11,468 total (Tennessee Resident), $29,656 total (Non-resident)

University of Vermont – Burlington, Vermont Complex Systems and Data Science (CSDS), PhD

Through the College of Engineering and Mathematical Sciences, the Complex Systems and Data Science (CSDS) PhD program is pan-disciplinary and provides computational and theoretical training. Students may customize the program depending on their chosen area of focus.

Students in this program work in research groups across campus.

Core courses include Data Science, Principles of Complex Systems and Modeling Complex Systems. Elective courses include Machine Learning, Complex Networks, Evolutionary Computation, Human/Computer Interaction, and Data Mining.

Graduation requires at least 75 credits and approval by the student’s graduate studies committee.

Delivery Method: Campus GRE: Not Required 2022-2023 Tuition: $12,204 total (Vermont Resident), $30,960 total (Non-resident)

University of Washington Seattle Campus – Seattle, Washington PhD in Big Data and Data Science

The University of Washington’s PhD program in data science has 2 key goals: training of new data scientists and cyberinfrastructure development, i.e., development of open-source tools and services that scientists around the world can use for big data analysis.

Students must take core courses in data management, machine learning, data visualization, and statistics.

Students are also required to complete at least one internship that covers practical work in big data.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $17,004 per year (Washington resident), $30,477 (non-resident)

University of Wisconsin-Madison – Madison, Wisconsin PhD in Biomedical Data Science

The PhD program in Biomedical Data Science offered by the Department of Biostatistics and Medical Informatics at UW-Madison is unique, in blending the best of statistics and computer science, biostatistics and biomedical informatics. 

Students complete three year-long course sequences in biostatistics theory and methods, computer science/informatics, and a specialized sequence to fit their interests.

Students also complete three research rotations within their first two years in the program, to both expand their breadth of knowledge and assist in identifying a research advisor.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $10,728 total (in-state), $24,054 total (out-of-state)

Vanderbilt University – Nashville, Tennessee Data Science Track of the BMI PhD Program

The PhD in biomedical informatics at Vanderbilt has the option of a data science track.

Students complete courses in the areas of biomedical informatics (3 courses), computer science (4 courses), statistical methods (4 courses), and biomedical science (2 courses). Students are expected to complete core courses and defend their dissertations within 5 years of beginning the program.

Applicants must have a bachelor’s degree in computer science, engineering, biology, biochemistry, nursing, mathematics, statistics, physics, information management, or some other health-related field.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $53,160 per year

Washington University in St. Louis – St. Louis, Missouri Doctorate in Computational & Data Sciences

Washington University now offers an interdisciplinary Ph.D. in Computational & Data Sciences where students can choose from one of four tracks (Computational Methodologies, Political Science, Psychological & Brain Sciences, or Social Work & Public Health).

Students are fully funded and will receive a stipend for at least five years contingent on making sufficient progress in the program.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $59,420 total

Worcester Polytechnic Institute – Worcester, Massachusetts PhD in Data Science

The PhD in data science at Worcester Polytechnic Institute focuses on 5 areas: integrative data science, business intelligence and case studies, data access and management, data analytics and mining, and mathematical analysis.

Students first complete a master’s in data science, and then complete 60 credit hours beyond the master’s, including 30 credit hours of research.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $28,980 per year

Yale University – New Haven, Connecticut PhD Program – Department of Stats and Data Science

The PhD in statistics and data science at Yale University offers broad training in the areas of statistical theory, probability theory, stochastic processes, asymptotics, information theory, machine learning, data analysis, statistical computing, and graphical methods. Students complete 12 courses in the first year in these topics.

Students are required to teach one course each semester of their third and fourth years.

Most students complete and defend their dissertations in their fifth year.

Applicants should have an educational background in statistics, with an undergraduate major in statistics, mathematics, computer science, or similar field.

Delivery Method: Campus GRE: Required 2022-2023 Tuition: $46,900 total


Data Mining Dissertation Topics

The term “data mining” refers to an intelligent data lookup capability that uses statistics-based algorithms and methodologies to find trends, patterns, links, and correlations within collected data and records. Audio, image, video, text, web, and social media mining are only a few examples of data mining. This article provides an overview of recent data mining dissertation topics. Let us first start with the definition of the data mining process.

Trending Data Mining Dissertation Topics for Research Scholars

What is the data mining process?

  • Data mining is the practice of evaluating a large body of data to find patterns in it.
  • Companies can use data mining for a variety of purposes, including learning what consumers are interested in or would like to buy, detecting fraudulent activity, and scanning for malware.

Hence, data mining plays a significant role in both the commercial and personal aspects of modern life. We have been working on data mining dissertation topics and project ideas for more than 15 years and have built up extensive knowledge, skills, and experience in the field, so we can guide you in all the established data mining methods and techniques. Let us now look at the data mining techniques below.

Data mining techniques 

  • Neural networks
  • Rule induction
  • Nearest neighbor classification
  • Decision tree
  • Descriptive techniques – sequential analysis, association, and clustering
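
As an illustration of one technique from the list above, nearest neighbor classification, here is a minimal sketch in plain Python. The toy points, the labels, and the function name `knn_predict` are assumptions made up for this example; they are not taken from any particular project or library.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in 2-D.
train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((0.9, 1.1), "low"),
    ((5.0, 5.0), "high"),
    ((5.2, 4.9), "high"),
    ((4.8, 5.1), "high"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.1, 5.0)))
```

The choice of `k` and of the distance measure is a design decision; Euclidean distance with `k=3` is used here only because it is the simplest common default.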

Complete explanations and descriptions of all these techniques and methods are available on our website on data mining dissertation topics. Understanding the importance of data mining, we have successfully completed several advanced projects and real-time implementations. Check out our website for details about our successful data mining projects. Let us now look at the data mining approaches below.

Approaches in data mining

  • Belief nets
  • Neural nets (Kohonen and backpropagation)
  • Decision trees (CHAID, CART, and C4.5)
  • Rules (genetic algorithms and induction)
  • Case-based reasoning
  • Nearest neighbor

This is the basic classification of the data mining approaches in use today. With the support of engineers and certified experts in data mining, we provide reliable and authentic research data along with complete support in interpreting, analyzing, and understanding it. Get in touch with us at any time for support with your data mining dissertation. We assure you of full support and guidance on any data mining dissertation topic. We will now talk about the major issues in data mining.

Major issues in data mining

  • Parallel, distributed, and incremental mining algorithms
  • Data mining algorithm efficiency and scalability
  • Incorporation of background data
  • Interactive mining
  • Data mining result presentation and visualization
  • Pattern evaluation
  • Pattern- and constraint-guided mining
  • Power boosting in networking environment
  • Data mining interdisciplinary approach
  • Data insufficiency and uncertainty
  • Handling the issues of noise
  • Multidimensional data mining space
  • Novel approaches and incorporating multiple aspects of data mining

We have handled all these issues efficiently and have devised successful methods to overcome them. Get in touch with us to learn more about the data mining solutions and advanced techniques used to overcome these issues. What are the top data mining topics?

Top 5 Data Mining Dissertation Topics

  • Given the widespread prevalence of interconnected, real-world data repositories, application domains such as biology, social media, and privacy regulation frequently face uncertainty.
  • This uncertainty and ambiguity also pervade the graphs that represent the data.
  • This issue calls for new data mining projects capable of capturing the nonlinear relationships between network nodes.
  • This collection of fundamental-level data mining projects will aid in building a solid foundation in core programming ideas.
  • One such approach is frequent subgraph and pattern mining on a single uncertain graph.
  • Integration of checkpoint-based and pruning procedures to extend the algorithm to the expected semantics
  • Computation-sharing methods to improve mining efficiency
  • An enumeration and evaluation technique for computing under probabilistic semantics
  • An approximation approach for efficient problem-solving
  • Systems for pattern recognition, recommendation, copyright infringement detection, and other web applications use pattern matching methods.
  • Usually, the technique uses locality-sensitive hashing (LSH), which builds on min-hashing, to answer nearest-neighbor queries.
  • It can be used in a variety of computational models for huge data sets, such as MapReduce and streaming.
  • Referencing data mining projects as your career can make it stand out from the crowd.
  • Nevertheless, robust LSH-based filtering and design are required for dynamic datasets.
  • The effective pattern matching project surpasses prior methods in this regard.
  • Proposes a nearest-neighbor data structure for dynamic data streams
  • Proposes a sketching technique for similarity estimation
  • It relies on the Jaccard index as the similarity metric
  • This project is about a posting service that allows authorized users to publish text and image posts and to write comments on them.
  • In the current approach, users must manually read through many comments to separate out verified comments, positive comments, negative comments, and so on.
  • With sentiment analysis and opinion mining technology, users can check the status of their post without much effort.
  • It offers a summary view of the comments made on a post, as well as the ability to view them as a chart.
  • In behavior analytics, negative sequential patterns (NSPs) can be more informative than positive sequential patterns (PSPs).
  • For example, in research on a disease, data about delayed treatment could be more relevant than information about completing a major surgical operation.
  • NSP mining, however, is still in its infancy.
  • The ‘Topk-NSP+’ algorithm is a dependable option for addressing these mining challenges.
  • Using the current approach, mine the top-k PSPs
  • Using a method identical to that used to mine the top-k PSPs, mine the top-k NSPs out of these PSPs.
  • Using various optimization techniques to find useful NSPs while lowering the computational burden
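
The pattern matching topic in the list above relies on min-hashing and the Jaccard score as a similarity metric. A minimal sketch of how a MinHash signature can approximate Jaccard similarity follows; it is an illustration only, not the method of any specific project. The helper names (`make_hash_funcs`, `minhash_signature`, `estimate_jaccard`), the use of CRC32 to turn tokens into integers, and the sample sentences are all assumptions chosen for this example. A full LSH index would additionally group these signatures into bands and buckets, which is omitted here.

```python
import random
import zlib


def jaccard(a, b):
    """Exact Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_hash_funcs(num_hashes, seed=0):
    """Build random linear hash functions h(x) = (a*x + b) mod p."""
    prime = 2_147_483_647  # 2**31 - 1, a Mersenne prime
    rng = random.Random(seed)
    return [
        (rng.randrange(1, prime), rng.randrange(0, prime), prime)
        for _ in range(num_hashes)
    ]


def minhash_signature(items, hash_funcs):
    """Signature = minimum hash value of the (non-empty) set per hash function."""
    # CRC32 gives a deterministic integer id for each string token.
    ids = [zlib.crc32(item.encode("utf-8")) for item in items]
    return [min((a * x + b) % p for x in ids) for a, b, p in hash_funcs]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions approximates Jaccard."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical sample documents, tokenized into word sets.
doc_a = set("the quick brown fox jumps over the lazy dog".split())
doc_b = set("the quick brown fox leaps over a sleepy dog".split())

funcs = make_hash_funcs(200)
sig_a = minhash_signature(doc_a, funcs)
sig_b = minhash_signature(doc_b, funcs)

print("exact Jaccard:   ", jaccard(doc_a, doc_b))
print("MinHash estimate:", estimate_jaccard(sig_a, sig_b))
```

More hash functions give a tighter estimate at the cost of longer signatures; the value of 200 used here is an arbitrary choice for the sketch.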

In recent years, there has been a spike in demand for data mining and associated fields. You can keep up with current trends and advancements using the data mining projects and topics listed above. So keep your curiosity stimulated and your knowledge up to date.

  • This is a realistic data mining application that will be beneficial in the long run.
  • Consider the user account data that the largest social networking companies, such as online dating websites, collect and manage.
  • Users who submit a search are matched against selected criteria, by which their profiles are correlated with those of other members.
  • This method must be secure enough to defend against unwanted data theft of any kind.
  • To protect user privacy, various methods are in use today, including encryption algorithms and multiple sites that authenticate users' profile details.

We have successfully delivered projects and dissertation work on all these topics. Our technical team and writers are highly qualified and are dedicated to turning project ideas into reality. You can contact our customer support at any time with doubts and queries related to data mining. Let us now look at data mining implementation tools below.

Data Mining Tools

  • WEKA, Orange, Tanagra and NLTK
  • Angoss, Oracle, and STATISTICA (or StatSoft)
  • Pentaho, Rattle, and Apache Mahout
  • RapidMiner, R programming, and KNIME
  • JHepWork, IBM SPSS, and SAS Enterprise Miner

Tips and advice on using these data mining tools are explained in detail on our website. We are also here to help you handle these tools efficiently, with proper demonstrations and explanations. Our engineers are highly skilled in working with them, so reach out to us for any support related to data mining. What are the recent trends in data mining?

Latest trends in data mining

  • Spatial data mining and semantic web mining
  • Personalized systems for recommendations and low-quality source data mining
  • Data retrieval based on content and multimedia retrieval
  • Graph theory data retrieval and data mining quantum computing
  • Integration of data warehousing and DNA
  • Retrieval based on content and audio mining at low quality
  • Itemset mining for optimization of MapReduce
  • Analyzing sentiments on social media and P2P
  • Assessing the quality of multimedia and Internet of Things applications using data mining
  • Management based on grid databases and Context-aware computing

At present, we offer complete project support and dissertation writing guidance, along with help on assignments, paper publication, proposals, theses, and more, with proper grammar checks, full review, and approval. We are therefore here to help you in all aspects of your data mining research. What datasets are available for data mining?

Datasets for Data Mining Projects

  • It is a data marketplace and open catalog
  • With Infochimps, you can share, sell, curate, and download data
  • It contains about forty-four million blog posts
  • It ranges from August to October of 2008
  • Artificial intelligence-based photos and data collection
  • Useful for academic and research purposes
  • Collection of geospatial and geographic data
  • Artificial intelligence and machine learning-based updated data collection
  • Data is collected from around ten thousand Europe based companies
  • It is a repository of molecular abundance and gene expression
  • It supports MIAME compliances
  • Retrieving, querying, and browsing data is made possible with this gene expression resource
  • Collection of stocks and futures-based financial data
  • Google-based text collection from various books

Apart from these datasets, there are also many others, including CIDDS, DARPA, CICIDS2017, ADFA-IDS, TUIDS, ISCXIDS2012, AWID, and NSL-KDD. Complete information on all these datasets and tips for handling them efficiently will be shared with you when you use our services on data mining dissertation topics. Feel free to contact our experts with any doubts about your data mining research. We will work to resolve them promptly.

Dissertations / Theses on the topic 'Data mining'

Consult the top 50 dissertations / theses for your research on the topic 'Data mining.'

Mrázek, Michal. "Data mining." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2019.

Payyappillil, Hemambika. "Data mining framework." Morgantown, W. Va. : [West Virginia University Libraries], 2005.

Abedjan, Ziawasch. "Improving RDF data with data mining." PhD thesis, Universität Potsdam, 2014.

Liu, Tantan. "Data Mining over Hidden Data Sources." The Ohio State University, 2012.

Taylor, Phillip. "Data mining of vehicle telemetry data." Thesis, University of Warwick, 2015.

Sherikar, Vishnu Vardhan Reddy. "I2MAPREDUCE: DATA MINING FOR BIG DATA." CSUSB ScholarWorks, 2017.

Zhang, Nan. "Privacy-preserving data mining." [College Station, Tex.] : Texas A&M University, 2006.

Hulten, Geoffrey. "Mining massive data streams /." Thesis, Connect to this title online; UW restricted, 2005.

Büchel, Nina. "Faktorenvorselektion im Data Mining /." Berlin : Logos, 2009.

Shao, Junming. "Synchronization Inspired Data Mining." Diss., lmu, 2011.

Wang, Xiaohong. "Data mining with bilattices." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001.

Knobbe, Arno J. "Multi-relational data mining /." Amsterdam [u.a.] : IOS Press, 2007.

丁嘉慧 and Ka-wai Ting. "Time sequences: data mining." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001.

Wan, Chang, and 萬暢. "Mining multi-faceted data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013.

García-Osorio, César. "Data mining and visualization." Thesis, University of Exeter, 2005.

Wang, Grant J. (Grant Jenhorn) 1979. "Algorithms for data mining." Thesis, Massachusetts Institute of Technology, 2006.

Anwar, Muhammad Naveed. "Data mining of audiology." Thesis, University of Sunderland, 2012.

Santos, José Carlos Almeida. "Mining protein structure data." Master's thesis, FCT - UNL, 2006.

Garda-Osorio, Cesar. "Data mining and visualisation." Thesis, University of the West of Scotland, 2005.

Rawles, Simon Alan. "Object-oriented data mining." Thesis, University of Bristol, 2007.

Mao, Shihong. "Comparative Microarray Data Mining." Wright State University / OhioLINK, 2007.

Novák, Petr. "Data mining časových řad." Master's thesis, Vysoká škola ekonomická v Praze, 2009.

Blunt, Gordon. "Mining credit card data." Thesis, n.p, 2002.

Niggemann, Oliver. "Visual data mining of graph based data." [S.l. : s.n.], 2001.

Li, Liangchun. "Web-based data visualization for data mining." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998.

Al-Hashemi, Idrees Yousef. "Applying data mining techniques over big data." Thesis, Boston University, 2013.

Zhou, Wubai. "Data Mining Techniques to Understand Textual Data." FIU Digital Commons, 2017.

KAVOOSIFAR, MOHAMMAD REZA. "Data Mining and Indexing Big Multimedia Data." Doctoral thesis, Politecnico di Torino, 2019.

Adderly, Darryl M. "Data mining meets e-commerce: using data mining to improve customer relationship management." [Gainesville, Fla.]: University of Florida, 2002.

Vithal, Kadam Omkar. "Novel applications of Association Rule Mining- Data Stream Mining." AUT University, 2009.

Patel, Akash. "Data Mining of Process Data in Multivariable Systems." Thesis, KTH, Skolan för elektro- och systemteknik (EES), 2016.

Cordeiro, Robson Leonardo Ferreira. "Data mining in large sets of complex data." Universidade de São Paulo, 2011.

XIAO, XIN. "Data Mining Techniques for Complex User-Generated Data." Doctoral thesis, Politecnico di Torino, 2016.

Borgelt, Christian. "Data mining with graphical models." [S.l. : s.n.], 2000.

Weber, Irene. "Suchraumbeschränkung für relationales Data Mining." [S.l. : s.n.], 2004.

Maden, Engin. "Data Mining On Architecture Simulation." Master's thesis, METU, 2010.

Drwal, Maciej. "Data mining in distributed computer systems." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2009.

Thun, Julia, and Rebin Kadouri. "Automating debugging through data mining." Thesis, KTH, Data- och elektroteknik, 2017.

Rahman, Sardar Muhammad Monzurur. "Data Mining Using Neural Networks." RMIT University. Electrical & Computer Engineering, 2006.

Guo, Shishan. "Data mining in crystallographic databases." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000.

Sun, Wenyi. "Data mining extension for economics." Diss., Columbia, Mo. : University of Missouri-Columbia, 2006.

Papadatos, George. "Data mining for lead optimisation." Thesis, University of Sheffield, 2011.

Rice, Simon B. "Text data mining in bioinformatics." Thesis, University of Manchester, 2005.

Lin, Zhenmin. "Privacy Preserving Distributed Data Mining." UKnowledge, 2012.

Tong, Suk-man Ivy, and 湯淑敏. "Techniques in data stream mining." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005.

Luo, Man. "Data mining and classical statistics." Virtual Press, 2004.

Cai, Zhongming. "Technical aspects of data mining." Thesis, Cardiff University, 2001.

Shioda, Romy 1977. "Integer optimization in data mining." Thesis, Massachusetts Institute of Technology, 2003.

Lo, Ya-Chin, and 羅雅琴. "Data mining in bioinformatics -- NCBI tools for data mining." Thesis, 2004.

PhD Projects in Data Mining

PhD Projects in Data Mining develops new research work intended to advance your career. We offer a well-equipped setup for PhD students who want to carry out a project in data mining. Data mining is an active research area with many applications.

Data mining involves gathering data and finding the patterns present in it, and it helps turn that data into useful information. It often overlaps with other areas such as IoT, cloud computing, and big data. Our experts will also carry out a thorough data mining project analysis.


UCI Machine Learning Repository

  • Hepatitis C Virus (HCV) for Egyptian Patients
  • Human Activity Recognition from Continuous Ambient Sensor Data
  • Beijing Multi-Site Air-Quality Data
  • WISDM Smartphone and Smartwatch Activity and Biometrics Dataset

Most Popular

  • Breast Cancer Wisconsin (Diagnostic)
  • Forest Fires
  • Human Activity

Insider & Intrusion Threats Dataset

  • KDD Cup 99 dataset
  • NSL KDD Dataset
  • CIDDS Dataset
  • ADFA-IDS 2017
  • UGR Dataset
  • CIC IDS Dataset
  • Contagio-CTU-UNB
  • ADFA Intrusion Detection Datasets
  • University of New Brunswick datasets

PhD Projects in Data Mining gives newcomers a technical platform for pursuing their research in a realistic manner. We also respect your own data mining project ideas and promise to treat them with the utmost care.

Most Researched Data Mining Topics in Current Days

  • Graph Mining for Malware Detection
  • Data Assimilation by Neural Networks
  • Task-Oriented Pattern Mining
  • Big Data Mining
  • Cyber Security for Massive Data
  • 5G Technology
  • Software Defined Networking
  • Information Security
  • Distributed Data Mining
  • Blockchain in Data Analytics
  • Cluster Analysis for Data Mining
  • Mining with Deep Learning

You can also get the data mining project code alone from our experts. All you need to do is explain your concept along with its input and output. Our experts will write your code in the language and tool that you specify, and you will receive it on time.

PhD Projects in Data Mining will help you learn from your failures and move ahead to succeed. We also have a team of 150+ top-rated experts to assist you with any tool.


  • IBM SPSS Modeler
  • Hadoop

At this point, we finish all your project work and wrap it up after corrections. Next, you receive the project with all the add-ons. Our experts will explain all the terms in your work to clarify your doubts. If you need the details one more time, just call our help desk, and we will be at your service.

Without a plan, your research is idle; work with us to take your research to the next level!

Finally, here are a few recent ideas in data mining:

Uncertain Sensor Data for Trajectory Mining

Mining High-Utility Itemsets using Selective Database Projections Based Methodology

Mining Frequent Patterns using MapReduce-Based Apriori Versions on Big Data

A Real-Time Massive Data Processing Method for Densely Distributed Sensor Networks

A Novel Association Rule Mining Approach for Probabilistic Graph Model-Based Power Transformers State Parameters in Big Data

A Big Data Analytics Oriented Data Engineering based on Schema Theory in Gene Expression Programming

Prediction of Hospital Admissions From the Emergency Department in Data Mining

A Method of Mining Hidden Transition of Business Process using Region

An Efficient Novel Upper-Bounds-Based Vertical Mining of High Average-Utility Itemsets

Data mining complex correlations for Islanding detection of synchronous distributed generators

An Algorithm of Weighted Frequent Itemset Mining for Intelligent Decision in Smart Systems

An Alternative Method: Estimating 3-D Large Displacements of Mining Areas from a Single SAR Amplitude Pair based on Offset Tracking

A Chronic disease progression mining using Heterogeneous network

Personalized E-Learning Model - Integration of Data Mining Clustering Techniques

Analyze Travel Time in Road-Based Mass Transit Systems using Systematic Approach in Data Mining

Privacy preserving: association rule hiding based on fuzzy logic approach for big data mining

Reducing Redundancy for Prevalent Co-Location Patterns

A Goal-oriented Requirement Analysis Method for Non-Expert Users - Data Mining Techniques Selection

Line Trip Fault Prediction using Data in Power Systems based on LSTM Networks and SVM

A Real-Time PCA-Based Applications using Indirect Power-System Contingency Screening

Why Work With Us?

Our Editor-in-Chief has website ownership and controls and delivers all aspects of PhD direction to scholars and students, and also oversees the full management of all our clients.

Our world-class certified experts have 18+ years of experience in research and development programs (industrial research) and have guided as many scholars as possible in developing strong PhD research projects.

We are associated with 200+ reputed SCI- and SCOPUS-indexed journals (SJR ranking) so that research work can be published in standard journals (your first-choice journal). is world’s largest book publishing platform that predominantly work subject-wise categories for scholars/students to assist their books writing and takes out into the University Library.

Our researchers uphold the required research ethics, such as confidentiality and privacy, novelty (valuable research), plagiarism-free work, and timely delivery. Our customers are free to examine their current specific research activities.

Our organization prioritizes customer satisfaction, online and offline support, and professional delivery of work, since these are the factors that drive our business.

Solid work is delivered by a young, qualified, global research team. References are the key to evaluating work more easily, because we carefully assess scholars' findings.

Detailed videos, readme files, and screenshots are provided for all research projects. We provide TeamViewer support and other online channels for project explanation.

Publication in worthy journals, such as those of IEEE, ACM, Springer, IET, and Elsevier, is our main focus. We substantially reduce scholars' burden on the publication side and carry scholars from initial submission to final acceptance.

  • Data Analytics and Machine Learning Group
  • TUM School of Computation, Information and Technology
  • Technical University of Munich

Open Topics

We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics follows.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

Robustness of Large Language Models

Type: Master's Thesis


  • Strong knowledge in machine learning
  • Very good coding skills
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
  • Knowledge about NLP and LLMs


The success of Large Language Models (LLMs) has precipitated their deployment across a diverse range of applications. With the integration of plugins enhancing their capabilities, it becomes imperative to ensure that the governing rules of these LLMs are foolproof and immune to circumvention. Recent studies have exposed significant vulnerabilities inherent to these models, underlining an urgent need for more rigorous research to fortify their resilience and reliability. A focus in this work will be the understanding of the working mechanisms of these attacks.

We are currently seeking students for the upcoming Summer Semester of 2024, so we welcome prompt applications. 

Contact: Tom Wollschläger


  • Universal and Transferable Adversarial Attacks on Aligned Language Models
  • Attacking Large Language Models with Projected Gradient Descent
  • Representation Engineering: A Top-Down Approach to AI Transparency
  • Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Generative Models for Drug Discovery

Type: Master's Thesis / Guided Research

  • Strong machine learning knowledge
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • Knowledge of graph neural networks (e.g. GCN, MPNN)
  • No formal education in chemistry, physics or biology needed!

Effectively designing molecular geometries is essential to advancing pharmaceutical innovations, a domain which has experienced great attention through the success of generative models. These models promise a more efficient exploration of the vast chemical space and generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Our topics lie at the intersection of generative models like diffusion/flow matching models and graph representation learning, e.g., graph neural networks. The focus of our projects can be model development with an emphasis on downstream tasks (e.g., diffusion guidance at inference time) and a better understanding of the limitations of existing models.

Contact: Johanna Sommer, Leon Hetzel

  • Equivariant Diffusion for Molecule Generation in 3D
  • Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
  • Structure-based Drug Design with Equivariant Diffusion Models

Data Pruning and Active Learning

Type: Interdisciplinary Project (IDP) / Hiwi / Guided Research / Master's Thesis

Data pruning and active learning are vital techniques in scaling machine learning applications efficiently. Data pruning involves the removal of redundant or irrelevant data, which enables training models with considerably less data but the same performance. Similarly, active learning describes the process of selecting the most informative data points for labeling, thus reducing annotation costs and accelerating model training. However, current methods are often computationally expensive, which makes them difficult to apply in practice. Our objective is to scale active learning and data pruning methods to large datasets using an extrapolation-based approach.
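
As an illustration of what "selecting the most informative data points" can mean, the following minimal sketch ranks unlabeled items by the entropy of a model's predicted class probabilities and picks the most uncertain ones. This is plain uncertainty sampling, one common baseline, and not the extrapolation-based approach mentioned above; the function names and the probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool_probs, budget):
    """Return the indices of the `budget` most uncertain pool items.

    pool_probs holds one class-probability vector per unlabeled item
    (hypothetical model outputs, assumed to sum to 1).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical predictions for four unlabeled items.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]
chosen = select_for_labeling(pool, budget=2)
print(chosen)
```

Scoring every pool item requires a forward pass over the whole pool, which hints at why such methods become expensive on large datasets.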

Contact: Sebastian Schmidt, Tom Wollschläger, Leo Schwinn

  • Large-scale Dataset Pruning with Dynamic Uncertainty

Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI

Type: Master's Thesis / Guided Research / Hiwi

The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models by looking at multiple techniques like pruning, quantization, distillation, and more.
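
To make two of the named techniques concrete, here is a small sketch of magnitude pruning followed by symmetric 8-bit quantization on a flat list of weights. It is a toy illustration under simplifying assumptions (one global threshold, one scale per tensor, no retraining) and is not the method used by the project or its partners; all names and values are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    drop = set(smallest)
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [int(round(w / scale)) for w in weights], scale


def dequantize(q, scale):
    """Map the integers back to approximate floating-point weights."""
    return [v * scale for v in q]


w = [0.02, -1.27, 0.40, -0.05, 0.90, 0.01]  # hypothetical layer weights
pruned = magnitude_prune(w, sparsity=0.5)    # half of the weights become zero
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
print(pruned, q, restored)
```

The rounding error of each restored weight is bounded by half the quantization step, which is the memory-versus-accuracy trade-off such methods have to manage.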

Contact: Bertrand Charpentier

  • The Efficiency Misnomer
  • A Gradient Flow Framework for Analyzing Network Pruning
  • Distilling the Knowledge in a Neural Network
  • A Survey of Quantization Methods for Efficient Neural Network Inference

Deep Generative Models

Type: Master's Thesis / Guided Research

  • Strong machine learning and probability theory knowledge
  • Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
  • Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory

With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.

Contact: Marcel Kollovieh, David Lüdke

  • Flow Matching for Generative Modeling
  • Auto-Encoding Variational Bayes
  • Denoising Diffusion Probabilistic Models 
  • Structured Denoising Diffusion Models in Discrete State-Spaces

Graph Structure Learning

Type:  Guided Research / Hiwi

  • Optional: Knowledge of graph theory and mathematical optimization

Graph deep learning is a powerful ML concept that enables the generalisation of successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results in a vast range of applications spanning the social sciences, biomedicine, particle physics, computer vision, graphics and chemistry. One of the major limitations of most current graph neural network architectures is that they often rely on the assumption that the underlying graph is known and fixed. However, this assumption is not always true, as the graph may be noisy or partially and even completely unknown. In the case of noisy or partially available graphs, it would be useful to jointly learn an optimised graph structure and the corresponding graph representations for the downstream task. On the other hand, when the graph is completely absent, it would be useful to infer it directly from the data. This is particularly interesting in inductive settings where some of the nodes were not present at training time. Furthermore, learning a graph can become an end in itself, as the inferred structure can provide complementary insights with respect to the downstream task. In this project, we aim to investigate solutions and devise new methods to construct an optimal graph structure based on the available (unstructured) data.
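
For the case where no graph is given at all, a simple non-learned baseline is to connect each data point to its nearest neighbours in feature space. The sketch below builds such a k-nearest-neighbour graph; it only illustrates what "inferring a graph from the data" can mean and is not one of the learned methods this project targets. The function name and the sample points are hypothetical.

```python
import math


def knn_graph(features, k):
    """Build a symmetric adjacency map linking each node to its k nearest neighbours."""
    n = len(features)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Sort all other nodes by Euclidean distance to node i.
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrise so the graph is undirected
    return adj


# Two hypothetical clusters of 2-D points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
graph = knn_graph(points, k=1)
print(graph)
```

A fixed rule like this ignores the downstream task, which is the gap that jointly learning the graph and the node representations aims to close.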

Contact: Filippo Guerranti

  • A Survey on Graph Structure Learning: Progress and Opportunities
  • Differentiable Graph Module (DGM) for Graph Convolutional Networks
  • Learning Discrete Structures for Graph Neural Networks
  • NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification

A Machine Learning Perspective on Corner Cases in Autonomous Driving Perception  

Type: Master's Thesis 

Industrial partner: BMW 


  • Strong knowledge in machine learning 
  • Knowledge of Semantic Segmentation  
  • Good programming skills 
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch) 


In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as semantic segmentation. While the environment in datasets is controlled, novel classes or unknown disturbances can occur in real-world applications. To provide safe autonomous driving, these cases must be identified.

The objective is to explore novel class segmentation and out of distribution approaches for semantic segmentation in the context of corner cases for autonomous driving. 

Contact: Sebastian Schmidt


  • Segmenting Known Objects and Unseen Unknowns without Prior Knowledge 
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos  
  • Natural Posterior Network: Deep Bayesian Uncertainty for Exponential Family  
  • Description of Corner Cases in Automated Driving: Goals and Challenges 

Active Learning for Multi Agent 3D Object Detection 

Type: Master's Thesis

Industrial partner: BMW

  • Knowledge in Object Detection 
  • Excellent programming skills 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To provide promising results, these networks often require a lot of complex annotation data for training. These annotations are often costly and redundant. Active learning is used to select the most informative samples for annotation and cover a dataset with as little annotated data as possible.

The objective is to explore active learning approaches for 3D object detection using combined uncertainty and diversity based methods.  

  • Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving   
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos   
  • KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
  • Towards Open World Active Learning for 3D Object Detection   

Graph Neural Networks

Type:  Master's thesis / Bachelor's thesis / guided research

  • Knowledge of graph/network theory

Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representation using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
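
The message-passing idea described above can be sketched in a few lines: each node collects the feature vectors of its neighbours (and itself) and replaces its own representation with their mean. The sketch deliberately omits learned weight matrices and nonlinearities, so it shows only the aggregation step, not a trainable GNN layer; the function name and the toy graph are hypothetical.

```python
def message_passing_layer(node_features, edges):
    """One round of mean aggregation over each node's neighbours and itself.

    node_features maps node -> list of floats; edges is a list of
    undirected (u, v) pairs.
    """
    neighbours = {v: [v] for v in node_features}  # include a self-loop
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    updated = {}
    for v, nbrs in neighbours.items():
        dim = len(node_features[v])
        updated[v] = [
            sum(node_features[u][d] for u in nbrs) / len(nbrs) for d in range(dim)
        ]
    return updated


# Hypothetical path graph a - b - c with 2-D features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 0.0]}
edges = [("a", "b"), ("b", "c")]
h1 = message_passing_layer(features, edges)
print(h1)
```

After one round, node c has only seen information from b; reaching a requires a second round, which illustrates the limitation to direct neighbours that the topic description mentions.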

Contact: Simon Geisler

  • Semi-supervised classification with graph convolutional networks
  • Relational inductive biases, deep learning, and graph networks
  • Diffusion Improves Graph Learning
  • Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks
  • Reliable Graph Neural Networks via Robust Aggregation

Physics-aware Graph Neural Networks

Type:  Master's thesis / guided research

  • Proficiency with Python and deep learning frameworks (JAX or PyTorch)
  • Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
  • Optional: Knowledge of machine learning on molecules and quantum chemistry

Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There is a vast amount of applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years and there remain many open questions about how to best represent and leverage quantum mechanical properties and methods.

Contact: Nicholas Gao

  • Directional Message Passing for Molecular Graphs
  • Neural message passing for quantum chemistry
  • Learning to Simulate Complex Physics with Graph Networks
  • Ab initio solution of the many-electron Schrödinger equation with deep neural networks
  • Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
  • Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

Robustness Verification for Deep Classifiers

Type: Master's thesis / Guided research

  • Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
  • Strong background in mathematical optimization (preferably combined with Machine Learning setting)
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

Description: Recent work shows that deep classifiers are vulnerable to adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the deployment of promising neural-network-based classification methods in safety-critical scenarios. Therefore, new training methods should be proposed that promote (or preferably ensure) robust behaviour of the classifier around training samples.
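
A toy example shows how a small, bounded perturbation can flip a prediction. The sketch below applies the fast gradient sign idea to a linear classifier, where the gradient of the score with respect to the input is simply the weight vector. The weights, input and perturbation size are hypothetical, and a linear model stands in for a deep network purely for illustration.

```python
def predict(w, b, x):
    """Linear classifier returning +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def sign(v):
    """Return -1, 0 or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_linear(w, x, y, eps):
    """Move every coordinate by eps in the direction that lowers the margin for label y.

    For a linear score the input gradient is w, so the step is -y * eps * sign(w).
    """
    return [xi - y * eps * sign(wi) for wi, xi in zip(w, x)]


w, b = [2.0, -1.0, 0.5], 0.0   # hypothetical trained weights
x, y = [0.2, 0.1, 0.1], 1      # clean input, score 0.35, classified +1
x_adv = fgsm_linear(w, x, y, eps=0.15)
print(predict(w, b, x), predict(w, b, x_adv))
```

Each coordinate changes by at most 0.15, yet the score drops by 0.15 times the L1 norm of w (0.525 here), enough to cross the decision boundary. Certification methods aim to guarantee that no perturbation within a given radius can do this.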

Contact: Aleksei Kuvshinov

References (Background):

  • Intriguing properties of neural networks
  • Explaining and harnessing adversarial examples
  • SoK: Certified Robustness for Deep Neural Networks
  • Certified Adversarial Robustness via Randomized Smoothing
  • Formal guarantees on the robustness of a classifier against adversarial manipulation
  • Towards deep learning models resistant to adversarial attacks
  • Provable defenses against adversarial examples via the convex outer adversarial polytope
  • Certified defenses against adversarial examples
  • Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks

Uncertainty Estimation in Deep Learning

Type: Master's Thesis / Guided Research

  • Strong knowledge in probability theory

Safe prediction is a key feature in many intelligent systems. Classically, Machine Learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric and epistemic uncertainty bring knowledge about undecidable and uncommon situations. The uncertainty view can be a substantial help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve the uncertainty estimation in ML models in various types of tasks.

Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier
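
One common way to separate the two kinds of uncertainty, assuming an ensemble of models is available, is to split the entropy of the averaged prediction into the average entropy of the members (aleatoric part) and the remainder (epistemic part, the mutual information between prediction and model). The sketch below is a generic illustration of that decomposition, not a method proposed in this project; the member predictions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive entropy for ONE input into aleatoric and epistemic parts.

    member_probs holds one class-probability vector per ensemble member.
    """
    n = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n for c in range(n_classes)]
    total = entropy(mean)                                  # entropy of the averaged prediction
    aleatoric = sum(entropy(p) for p in member_probs) / n  # average member entropy
    epistemic = total - aleatoric                          # disagreement between members
    return total, aleatoric, epistemic


# Members agree that the input is ambiguous: noise in the data itself.
agree_noisy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# Members are individually confident but contradict each other: lack of knowledge.
disagree = [[0.99, 0.01], [0.01, 0.99], [0.99, 0.01]]

_, alea_a, epi_a = decompose_uncertainty(agree_noisy)
_, alea_d, epi_d = decompose_uncertainty(disagree)
print(alea_a, epi_a, alea_d, epi_d)
```

In the first case almost all uncertainty is aleatoric, in the second almost all is epistemic, even though both averaged predictions are far from confident.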

  • Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
  • Predictive Uncertainty Estimation via Prior Networks
  • Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
  • Evidential Deep Learning to Quantify Classification Uncertainty
  • Weight Uncertainty in Neural Networks

Hierarchies in Deep Learning

Type:  Master's Thesis / Guided Research

Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their historical evolutions. Learning multi-scale structures can help to exhibit natural and meaningful organizations in the data and also to obtain compact data representation. The goal of this project is to leverage multi-scale structures to improve the speed, performance, and understanding of Deep Learning models.

Contact: Marcel Kollovieh, Bertrand Charpentier

  • Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
  • Hierarchical Graph Representation Learning with Differentiable Pooling
  • Gradient-based Hierarchical Clustering
  • Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space

Eindhoven University of Technology research portal

Data Mining

  • Data Science
  • Data and Artificial Intelligence

Student theses

3d face reconstruction using deep learning.

Supervisor: Medeiros de Carvalho, R. (Supervisor 1), Gallucci, A. (Supervisor 2) & Vanschoren, J. (Supervisor 2)

Student thesis: Master

Achieving Long Term Fairness through Curiosity Driven Reinforcement Learning: How intrinsic motivation influences fairness in algorithmic decision making

Supervisor: Pechenizkiy, M. (Supervisor 1), Gajane, P. (Supervisor 2) & Kapodistria, S. (Supervisor 2)

Activity Recognition Using Deep Learning in Videos under Clinical Setting

Supervisor: Duivesteijn, W. (Supervisor 1), Papapetrou, O. (Supervisor 2), Zhang, L. (External person) (External coach) & Vasu, J. D. (External coach)

A Data Cleaning Assistant

Supervisor: Vanschoren, J. (Supervisor 1)

Student thesis: Bachelor

A Data Cleaning Assistant for Machine Learning

A deep learning approach for clustering a multi-class dataset.

Supervisor: Pei, Y. (Supervisor 1), Marczak, M. (External person) (External coach) & Groen, J. (External person) (External coach)

Aerial Imagery Pixel-level Segmentation

A framework for understanding business process remaining time predictions.

Supervisor: Pechenizkiy, M. (Supervisor 1) & Scheepens, R. J. (Supervisor 2)

A Hybrid Model for Pedestrian Motion Prediction

Supervisor: Pechenizkiy, M. (Supervisor 1), Muñoz Sánchez, M. (Supervisor 2), Silvas, E. (External coach) & Smit, R. M. B. (External coach)

Algorithms for center-based trajectory clustering

Supervisor: Buchin, K. (Supervisor 1) & Driemel, A. (Supervisor 2)

Allocation Decision-Making in Service Supply Chain with Deep Reinforcement Learning

Supervisor: Zhang, Y. (Supervisor 1), van Jaarsveld, W. L. (Supervisor 2), Menkovski, V. (Supervisor 2) & Lamghari-Idrissi, D. (Supervisor 2)

Analyzing Policy Gradient approaches towards Rapid Policy Transfer

An empirical study on dynamic curriculum learning in information retrieval.

Supervisor: Fang, M. (Supervisor 1)

An Explainable Approach to Multi-contextual Fake News Detection

Supervisor: Pechenizkiy, M. (Supervisor 1), Pei, Y. (Supervisor 2) & Das, B. (External person) (External coach)

An exploration and evaluation of concept based interpretability methods as a measure of representation quality in neural networks

Supervisor: Menkovski, V. (Supervisor 1) & Stolikj, M. (External coach)

Anomaly detection in image data sets using disentangled representations

Supervisor: Menkovski, V. (Supervisor 1) & Tonnaer, L. M. A. (Supervisor 2)

Anomaly Detection in Polysomnography signals using AI

Supervisor: Pechenizkiy, M. (Supervisor 1), Schwanz Dias, S. (Supervisor 2) & Belur Nagaraj, S. (External person) (External coach)

Anomaly detection in text data using deep generative models

Supervisor: Menkovski, V. (Supervisor 1) & van Ipenburg, W. (External person) (External coach)

Anomaly Detection on Dynamic Graph

Supervisor: Pei, Y. (Supervisor 1), Fang, M. (Supervisor 2) & Monemizadeh, M. (Supervisor 2)

Anomaly Detection on Finite Multivariate Time Series from Semi-Automated Screwing Applications

Supervisor: Pechenizkiy, M. (Supervisor 1) & Schwanz Dias, S. (Supervisor 2)

Anomaly Detection on Multivariate Time Series Using GANs

Supervisor: Pei, Y. (Supervisor 1) & Kruizinga, P. (External person) (External coach)

Anomaly detection on vibration data

Supervisor: Hess, S. (Supervisor 1), Pechenizkiy, M. (Supervisor 2), Yakovets, N. (Supervisor 2) & Uusitalo, J. (External person) (External coach)

Application of P&ID symbol detection and classification for generation of material take-off documents (MTOs)

Supervisor: Pechenizkiy, M. (Supervisor 1), Banotra, R. (External person) (External coach) & Ya-alimadad, M. (External person) (External coach)

Applications of deep generative models to Tokamak Nuclear Fusion

Supervisor: Koelman, J. M. V. A. (Supervisor 1), Menkovski, V. (Supervisor 2), Citrin, J. (Supervisor 2) & van de Plassche, K. L. (External coach)

A Similarity Based Meta-Learning Approach to Building Pipeline Portfolios for Automated Machine Learning

Aspect-based few-shot learning.

Supervisor: Menkovski, V. (Supervisor 1)

Assessing Bias and Fairness in Machine Learning through a Causal Lens

Supervisor: Pechenizkiy, M. (Supervisor 1)

Assessing fairness in anomaly detection: A framework for developing a context-aware fairness tool to assess rule-based models

Supervisor: Pechenizkiy, M. (Supervisor 1), Weerts, H. J. P. (Supervisor 2), van Ipenburg, W. (External person) (External coach) & Veldsink, J. W. (External person) (External coach)

A Study of an Open-Ended Strategy for Learning Complex Locomotion Skills

A systematic determination of metrics for classification tasks in OpenML, a universally applicable EMM framework

Supervisor: Duivesteijn, W. (Supervisor 1), van Dongen, B. F. (Supervisor 2) & Yakovets, N. (Supervisor 2)

Automated machine learning with gradient boosting and meta-learning

Automated object recognition of solar panels in aerial photographs: a case study in the Liander service area

Supervisor: Pechenizkiy, M. (Supervisor 1), Medeiros de Carvalho, R. (Supervisor 2) & Weelinck, T. (External person) (External coach)

Automatic data cleaning

Automatic scoring of short open-ended questions.

Supervisor: Pechenizkiy, M. (Supervisor 1) & van Gils, S. (External coach)

Automatic Synthesis of Machine Learning Pipelines consisting of Pre-Trained Models for Multimodal Data

Automating string encoding in AutoML, autoregressive neural networks to model electroencephalography signals

Supervisor: Vanschoren, J. (Supervisor 1), Pfundtner, S. (External person) (External coach) & Radha, M. (External coach)

Balancing Efficiency and Fairness on Ride-Hailing Platforms via Reinforcement Learning

Supervisor: Tavakol, M. (Supervisor 1), Pechenizkiy, M. (Supervisor 2) & Boon, M. A. A. (Supervisor 2)

Benchmarking Audio DeepFake Detection

Better clustering evaluation for the OpenML evaluation engine

Supervisor: Vanschoren, J. (Supervisor 1), Gijsbers, P. (Supervisor 2) & Singh, P. (Supervisor 2)

Bi-level pipeline optimization for scalable AutoML

Supervisor: Nobile, M. (Supervisor 1), Vanschoren, J. (Supervisor 1), Medeiros de Carvalho, R. (Supervisor 2) & Bliek, L. (Supervisor 2)

Block-sparse evolutionary training using weight momentum evolution: training methods for hardware efficient sparse neural networks

Supervisor: Mocanu, D. (Supervisor 1), Zhang, Y. (Supervisor 2) & Lowet, D. J. C. (External coach)

Boolean Matrix Factorization and Completion

Supervisor: Peharz, R. (Supervisor 1) & Hess, S. (Supervisor 2)

Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining

Supervisor: Duivesteijn, W. (Supervisor 1) & Schouten, R. M. (Supervisor 2)

Bottom-Up Search: A Distance-Based Search Strategy for Supervised Local Pattern Mining on Multi-Dimensional Target Spaces

Supervisor: Duivesteijn, W. (Supervisor 1), Serebrenik, A. (Supervisor 2) & Kromwijk, T. J. (Supervisor 2)

Bridging the Domain-Gap in Computer Vision Tasks

Supervisor: Mocanu, D. C. (Supervisor 1) & Lowet, D. J. C. (External coach)

CCESO: Auditing AI Fairness By Comparing Counterfactual Explanations of Similar Objects

Supervisor: Pechenizkiy, M. (Supervisor 1) & Hoogland, K. (External person) (External coach)

Clean-Label Poison Attacks on Machine Learning

Supervisor: Michiels, W. P. A. J. (Supervisor 1), Schalij, F. D. (External coach) & Hess, S. (Supervisor 2)


PhD Thesis Topics in Data Mining

PhD Thesis Topics in Data Mining offers innovative ideas to strengthen your research career. Our data analysts regularly publish new ideas for research scholars and students. So far, more than 7000 research scholars have benefited from our research guidance. We have 120+ well-equipped branches around the world that provide ideas to research scholars.

We conduct a large number of workshop programmes to share our ideas worldwide. We are a fast-growing research organization, and we have published nearly 100 world-class journals, which has helped us build a global reputation.

Thesis Topics in Data Mining

PhD Thesis Topics in Data Mining presents useful information about the data mining research area. We offer guidance both online and offline for your convenience. Data mining is the process of discovering patterns and extracting useful information from large-scale datasets. We regularly provide research scholars and students with ideas based on current trends.

We have developed data mining applications in science, engineering, healthcare, and medicine. Our research analysts have 10+ years of experience in these areas. Some recent highlights, algorithms, and methods in data mining are listed below.

Recent Highlights in Data-Mining

  • Data Mining Tools and Software
  • Big Data Algorithms
  • Big Data Applications
  • Data Warehousing
  • Artificial Intelligence
  • Data Privacy and Ethics
  • Cloud Computing
  • OLAP Technologies
  • New Visualization Techniques
  • Kernel Methods
  • Data Mining and Search
  • ETL (Extract, Transform, and Load)
  • Forecasting from Big Data
  • Big Data and Optimization
  • Algorithms and Complexity
  • Social Network Analysis
  • Business Analytics

Algorithms and Methods

  • Frequent Pattern Mining
  • Mining with Constraints
  • Pre-processing & Data Cleaning
  • Mining on Emerging Architectures
  • Multi-Task Learning
  • Online Algorithms
  • High-Performance, Big Data & Scalable Computing Techniques
  • Dimension Reduction and Feature Extraction & Selection
  • Mining with Data Clouds
  • Mining Semi-Structured Data
  • Mining Complex Datasets
  • Web & Text Mining
  • Optimization Methods
  • Other Novel Methods
  • Classification
  • Statistical Methods
  • Graphical Models
  • Computational Learning Theory
  • Temporal & Spatial Mining
  • Data Stream Mining
  • Anomaly & Outlier Detection
  • Mining Graphs
Recent Real Time Applications

  • Drug Discovery
  • Healthcare Management
  • Process Control & Automation
  • High Energy Physics
  • Logistics Management
  • Supply Chain Management
  • Intelligence Analysis
  • Collaborative Filtering
  • Risk Management
  • Environmental Science / Ecology / Climate
  • Fraud & Intrusion Detection
  • Bio-surveillance
  • Sensor Network Applications
  • Social and Information Network Analyses
  • Educational Data Mining
  • Other Novel Applications & Case Studies
  • Astrophysics & Astronomy
  • Customer Relationship Management
  • Bioinformatics & Genomics

We also have highly experienced professionals who have analysed social issues and human factors in data mining. Some of these are as follows:

Social Issues and Human Factors

  • User Interfaces
  • Data Publishing & Privacy Preserving Data Mining
  • Ethics of Data Mining
  • Privacy Models
  • Relevance & Interestingness
  • Result & Data Visualization
  • Social Issues and Other Human Factors
  • Intellectual Ownership
  • Risk Analysis

Recent Research Topics

  • Big Data in Military Applications with Big Gains Foreseen
  • Social intelligence: opinion data mining in big data
  • Mine planning with Maptek's latest solutions in data mining
  • Data mining of private records via a headphone app
  • Data mining and analytics using the latest product innovation released by Sigma Systems
  • A method for identifying logical dependencies in data mining
  • Mining skier transportation patterns from ski resort lift usage data
  • Data mining techniques and Jackson's learning styles in an adaptive web-based e-learning English tutor
  • Correlation research for improving privacy-preserving methods in data mining
  • Web insights and data analytics using data mining
  • Customer classification and risk analysis in the health sector with an integrated customer relationship management and data mining framework
  • A comparison of educational data mining applications in engineering education
  • LMS for personalized education using analysis of data mining techniques

We hope the information above gives a clear idea of thesis topics in data mining. If you want more information, feel free to approach us for new, innovative, and creative topics in data mining. For more information about our projects, services, and offers, please call us at any time. We are here for you, and your attention is very important to us.


The Research Repository @ WVU


Mining Engineering Graduate Theses and Dissertations

Theses/Dissertations from 2023

Development of A Hydrometallurgical Process for the Extraction of Cobalt, Manganese, and Nickel from Acid Mine Drainage Treatment Byproduct , Alejandro Agudelo Mira

Selective Recovery of Rare Earth Elements from Acid Mine Drainage Treatment Byproduct , Zeynep Cicek

Identification of Rockmass Deformation and Lithological Changes in Underground Mines by Using Slam-Based Lidar Technology , Francisco Eduardo Gil Hurtado

Analysis of the Brittle Failure Mechanism of Underground Stone Mine Pillars by Implementing Numerical Modeling in FLAC3D , Rosbel Jimenez

Analysis of the root causes of fatal injuries in the United States surface mines between 2008 and 2021 , Maria Fernanda Quintero


Theses/Dissertations from 2022

Integrated Large Discontinuity Factor, Lamodel and Stability Mapping Approach for Stone Mine Pillar Stability , Mustafa Baris Ates

Noise Exposure Trends Among Violating Coal Mines, 2000 to 2021 , Hanna Grace Davis

Calcite depression in bastnaesite-calcite flotation system using organic acids , Emmy Muhoza

Investigation of Geomechanical Behavior of Laminated Rock Mass Through Experimental and Numerical Approach , Qingwen Shi

Static Liquefaction in Tailing Dams , Jose Raul Zela Concha

Experimental and Theoretical Investigation on the Initiation Mechanism of Low-Rank Coal's Self-Heating Process , Yinan Zhang

Development of an Entry-Scale Modeling Methodology to Provide Ground Reaction Curves for Longwall Gateroad Support Evaluation , Haochen Zhao

Size effect and anisotropy on the strength of shale under compressive stress conditions , Yun Zhao

Theses/Dissertations from 2021

Evaluation of LIDAR systems for rock mass discontinuity identification in underground stone mines from 3D point cloud data , Mario Alejandro Bendezu de la Cruz

Implementing the Empirical Stone Mine Pillar Strength Equation into the Boundary Element Method Software LaModel , Samuel Escobar

Recovery of Phosphorus from Florida Phosphatic Waste Clay , Amir Eskanlou

Optimization of Operating Conditions and Design Parameters on Coal Ultra-Fine Grinding Through Kinetic Stirred Mill Tests and Numerical Modeling , Francisco Patino

The Effect of Natural Fractures on the Mechanical Behavior of Limestone Pillars: A Synthetic Rock Mass Approach Application , Mustafa Can Süner

Evaluation of Various Separation Techniques for the Removal of Actinides from A Rare Earth-Containing Solution Generated from Coarse Coal Refuse , Deniz Talan

Geology Oriented Loading Approach for Underground Coal Mines , Deniz Tuncay

Various Operational Aspects of the Extraction of Critical Minerals from Acid Mine Drainage and Its Treatment By-product , Zhongqing Xiao

Theses/Dissertations from 2020

Adaptation of Coal Mine Floor Rating (CMFR) to Eastern U.S. Coal Mines , Sena Cicek

Upstream Tailings Dam - Liquefaction , Mladen Dragic

Development, Analysis and Case Studies of Impact Resistant Steel Sets for Underground Roof Fall Rehabilitation , Dakota D. Faulkner

The influence of spatial variance on rock strength and mechanism of failure , Danqing Gao

Fundamental Studies on the Recovery of Rare Earth Elements from Acid Mine Drainage , Xue Huang

Rational drilling control parameters to reduce respirable dust during roof bolting operations , Hua Jiang

Solutions to Some Mine Subsidence Research Challenges , Jian Yang

An Interactive Mobile Equipment Task-Training with Virtual Reality , Lazar Zujovic

Theses/Dissertations from 2019

Fundamental Mechanism of Time Dependent Failure in Shale , Neel Gupta

A Critical Assessment on the Resources and Extraction of Rare Earth Elements from Acid Mine Drainage , Christopher R. Vass

Time-dependent deformation and associated failure of roof in underground mines , Yuting Xue

Theses/Dissertations from 2018

Parametric Study of Coal Liberation Behavior Using Silica Grinding Media , Adewale Wasiu Adeniji

Three-dimensional Numerical Modeling Encompassing the Stability of a Vertical Gas Well Subjected to Longwall Mining Operation - A Case Study , Bonaventura Alves Mangu Bali

Shale Characterization and Size-effect study using Scanning Electron Microscopy and X-Ray Diffraction , Debashis Das

Behaviour Of Laminated Roof Under High Horizontal Stress , Prasoon Garg

Theses/Dissertations from 2017

Optimization of Mineral Processing Circuit Design under Uncertainty , Seyed Hassan Amini

Evaluation of Ultrasonic Velocity Tests to Characterize Extraterrestrial Rock Masses , Thomas W. Edge II

A Photogrammetry Program for Physical Modeling of Subsurface Subsidence Process , Yujia Lian

An Area-Based Calculation of the Analysis of Roof Bolt Systems (ARBS) , Aanand Nandula

Developing and implementing new algorithms into the LaModel program for numerical analysis of multiple seam interactions , Mehdi Rajaeebaygi

Adapting Roof Support Methods for Anchoring Satellites on Asteroids , Grant B. Speer

Simulation of Venturi Tube Design for Column Flotation Using Computational Fluid Dynamics , Wan Wang

Theses/Dissertations from 2016

Critical Analysis of Longwall Ventilation Systems and Removal of Methane , Robert B. Krog

Implementing the Local Mine Stiffness Calculation in LaModel , Kaifang Li

Development of Emission Factors (EFs) Model for Coal Train Loading Operations , Bisleshana Brahma Prakash

Nondestructive Methods to Characterize Rock Mechanical Properties at Low-Temperature: Applications for Asteroid Capture Technologies , Kara A. Savage

Mineral Asset Valuation Under Economic Uncertainty: A Complex System for Operational Flexibility , Marcell B. B. Silveira

A Feasibility Study for the Automated Monitoring and Control of Mine Water Discharges , Christopher R. Vass

Spontaneous Combustion of South American Coal , Brunno C. C. Vieira

Calibrating LaModel for Subsidence , Jian Yang

Theses/Dissertations from 2015

Coal Quality Management Model for a Dome Storage (DS-CQMM) , Manuel Alejandro Badani Prado

Design Programs for Highwall Mining Operations , Ming Fan

Development of Drilling Control Technology to Reduce Drilling Noise during Roof Bolting Operations , Mingming Li

The Online LaModel User's & Training Manual Development & Testing , Christopher R. Newman

How to mitigate coal mine bumps through understanding the violent failure of coal specimens , Gamal Rashed

Theses/Dissertations from 2014

Effect of biaxial and triaxial stresses on coal mine shale rocks , Shrey Arora

Stability Analysis of Bleeder Entries in Underground Coal Mines Using the Displacement-Discontinuity and Finite-Difference Programs , Xu Tang

Experimental and Theoretical Studies of Kinetics and Quality Parameters to Determine Spontaneous Combustion Propensity of U.S. Coals , Xinyang Wang

Bubble Size Effects in Coal Flotation and Phosphate Reverse Flotation using a Pico-nano Bubble Generator , Yu Xiong

Integrating the LaModel and ARMPS Programs (ARMPS-LAM) , Peng Zhang

Theses/Dissertations from 2013

Column Flotation of Subbituminous Coal Using the Blend of Trimethyl Pentanediol Derivatives and Pico-Nano Bubbles , Jinxiang Chen

Applications of Surface and Subsurface Subsidence Theories to Solve Ground Control Problems , Biao Qiu

Calibrating the LaModel Program for Shallow Cover Multiple-Seam Mines , Morgan M. Sears

The Integration of a Coal Mine Emergency Communication Network into Pre-Mine Planning and Development , Mark F. Sindelar

Factors considered for increasing longwall panel width , Jack D. Trackemas

An experimental investigation of the creep behavior of an underground coalmine roof with shale formation , Priyesh Verma

Evaluation of Rope Shovel Operators in Surface Coal Mining Using a Multi-Attribute Decision-Making Model , Ivana M. Vukotic

Theses/Dissertations from 2012

Calculating the Surface Seismic Signal from a Trapped Miner , Adeniyi A. Adebisi

Comprehensive and Integrated Model for Atmospheric Status in Sealed Underground Mine Areas , Jianwei Cheng

Production and Cost Assessment of a Potential Application of Surface Miners in Coal Mining in West Virginia , Timothy A. Nolan

The Integration of Geomorphic Design into West Virginia Surface Mine Reclamation , Alison E. Sears

Truck Cycle and Delay Automated Data Collection System (TCD-ADCS) for Surface Coal Mining , Patricio G. Terrazas Prado

New Abutment Angle Concept for Underground Coal Mining , Ihsan Berk Tulu

Theses/Dissertations from 2011

Experimental analysis of the post-failure behavior of coal and rock under laboratory compression tests , Dachao Neil Nie

The influence of interface friction and w/h ratio on the violence of coal specimen failure , Simon H. Prassetyo

Theses/Dissertations from 2010

A risk management approach to pillar extraction in the Central Appalachian coalfields , Patrick R. Bucks

The Impacts of Longwall Mining on Groundwater Systems -- A Case of Cumberland Mine Panels B5 and B6 , Xinzhi Du

Evaluation of ultrafine spiral concentrators for coal cleaning , Meng Yang

Theses/Dissertations from 2009

Development of a coal reserve GIS model and estimation of the recoverability and extraction costs , Chandrakanth Reddy Apala

Application and evaluation of spiral separators for fine coal cleaning , Zhuping Che

Weak floor stability in the Illinois Basin underground coal mines , Murali M. Gadde

Design of reinforced concrete seals for underground coal mines , Rajagopala Reddy Kallu

Employing laboratory physical modeling to study the radio imaging method (RIM) , Jun Lu

Influence of cutting sequence and time effects on cutters and roof falls in underground coal mine -- numerical approach , Anil Kumar Ray

Implementing energy release rate calculations into the LaModel program , Morgan M. Sears

Modeling PDC cutter rock interaction , Ihsan Berk Tulu

Analytical determination of strain energy for the studies of coal mine bumps , Qiang Xu

Improvement of the mine fire simulation program MFIRE , Lihong Zhou

Theses/Dissertations from 2008

Program-assisted analysis of the transverse pressure capacity of block stoppings for mine ventilation control , Timothy J. Batchler

Analysis of factors affecting wireless communication systems in underground coal mines , David P. McGraw

Analysis of underground coal mine refuge shelters , Mickey D. Mitchell

Theses/Dissertations from 2007

Dolomite flotation of high magnesium phosphate ores using fatty acid soap collectors , Zhengxing Gu

Evaluation of longwall face support hydraulic supply systems , Ted M. Klemetti II

Experimental studies of electromagnetic signals to enhance radio imaging method (RIM) , William D. Monaghan

Analysis of water monitoring data for longwall panels , Joseph R. Zirkle

Theses/Dissertations from 2006

Measurements of the electrical properties of coal measure rocks , Nikolay D. Boykov

Geomechanical and weathering properties of weak roof shales in coal mines , Hakan Gurgenli

Assessment and evaluation of noise controls on roof bolting equipment and a method for predicting sound pressure levels in underground coal mining , Rudy J. Matetic


PhD Thesis on Data Mining

PhD Thesis on Data Mining is a platform that helps you succeed with your thesis. First, a short definition: “Data mining is the step of discovering data-centric patterns in a large database.”

Today, it is a leading research field within ML, DL, and AI.

When it comes to thesis writing, your research must contribute sound arguments and reasonable proofs to the research community. To safeguard your research at this stage, PhD Thesis on Data Mining simplifies thesis writing with the help of our writers.

Simple steps for powerful PhD Thesis on Data Mining

  • First, share your needs and requirements with us
  • Text mining
  • Multimedia mining
  • Graph mining etc.
  • University rules
  • The time limit for your thesis
  • Your past research works
  • Then, we assign a technical writer
  • Next, you receive the thesis structure
  • After approval, writing of your thesis begins
  • Finally, you get the first draft of your thesis

The main goal of a data mining thesis project is to extract useful knowledge from complex, mixed datasets. To that end, we perform the following practices on raw datasets; they may vary according to your requirements. If you have any queries or revisions, clarify them with our writer.


  • Data preprocessing
  • Missing values filling
  • Noisy data cleaning
  • Normalization
  • Aggregation
  • Discretization
  • Hierarchy generation
  • Generalization
  • Cube aggregation
  • Data compression
  • Dimensionality reduction
  • Optimal attribute subset selection
  • Mutual information
  • Optimization algorithms
  • Whale optimization
  • Spider Monkey Optimization
  • Ant lion colony optimization
  • Data analysis
  • Transitive heuristic algorithm
  • Expectation-Maximization
  • Fuzzy clustering
  • ML (ANN, decision trees, SVM, and PCA)
  • DL techniques (such as DNN, CNN, LSTM, and DBN)
  • Least square regression
  • Logistic regression
  • Lasso regression
  • Multivariate regression
  • Multiple regression
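
As a rough illustration of three of the preprocessing practices listed above (missing values filling, normalization, and discretization), the following is a minimal sketch in plain Python. The function names, the mean-imputation choice, and the sample values are assumptions made for this example only; they are not taken from any specific project or library.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins):
    """Map values in [0, 1] to equal-width bin indices 0..bins-1."""
    return [min(int(v * bins), bins - 1) for v in values]


# Hypothetical raw attribute with one missing entry.
raw = [2.0, None, 4.0, 10.0]
filled = fill_missing(raw)
scaled = min_max_normalize(filled)
labels = discretize(scaled, bins=4)
print(labels)
```

Other imputation or binning strategies (median filling, equal-frequency bins) could be substituted without changing the overall flow: fill, then scale, then discretize.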

For current students, writing a thesis in the required format is a tough task. We take the same care when writing your thesis, much as a helpful friend would.

Use our PhD thesis on data mining service like a friend to save time and money and to reduce your stress by the end of the PhD journey. We also double-check every thesis with our proofreading team.

Our research areas of data mining

  • Sentiment analysis
  • Social network analysis
  • Frequent item-set mining
  • Anomaly detection
  • Recommender systems
  • Semantic web mining
  • Mining using AI
  • Bio-medical diagnosis
  • Query search systems

Our competence tools for your data mining research

  • Rapid Miner
  • R-programming

To conclude: stay within our success zone, and we will achieve great things in your research.

MILESTONE 1: Research Proposal

Finalize Journal (Indexing)

Before sitting down to write the research proposal, we need to decide on the exact target journals, e.g. SCI, SCI-E, ISI, or SCOPUS indexed.

Research Subject Selection

For a doctoral student, subject selection is a big problem. We have a team of world-class experts experienced in assisting with all subjects. When you decide to work in networking, for example, we assign experts from that specific area to assist you.

Research Topic Selection

We help you select a suitable topic that will sound interesting to the members of your committee. For example, if your interest is in networking, the research topic could be VANET, MANET, or another.

Literature Survey Writing

To ensure the novelty of the research, we find research gaps in 50+ recent benchmark papers (IEEE, Springer, Elsevier, MDPI, Hindawi, etc.).

Case Study Writing

After the literature survey, we identify the main issue or problem that your research topic aims to resolve, and we provide writing support to establish the relevance of that issue.

Problem Statement

Based on the research gaps found and the importance of your research, we formulate an appropriate and specific problem statement.

Writing Research Proposal

Writing a good research proposal takes a lot of time. We need only a short time to cover all major aspects (collecting reference papers, finding deficiencies, drawing the system architecture, and highlighting novelty).

MILESTONE 2: System Development

Fix Implementation Plan

We prepare a clear project implementation plan that describes your proposal step by step and contains the software and OS specification. We recommend tools and software that fit your concept.

Tools/Plan Approval

We obtain approval for the implementation tool, software, programming language, and finally the implementation plan before starting the development process.

Pseudocode Description

Our source code is original, since we write the code only after writing the pseudocode and algorithms and deriving the mathematical equations.

Develop Proposal Idea

We implement the novel idea step by step, as laid out in the implementation plan. We can also help scholars with the implementation.


We compare the proposed and existing schemes both quantitatively and qualitatively, since this is the most crucial part of any journal paper.

Graphs, Results, Analysis Table

We evaluate and analyze the project results by plotting graphs, computing numerical results, and discussing the quantitative results in tables.

Project Deliverables

For every project order, we deliver the following: reference papers, source code, screenshots, a project video, and installation and running procedures.

MILESTONE 3: Paper Writing

Choosing Right Format

We write the paper in a customized layout. If you are interested in a specific journal, we are ready to support you. Otherwise, we prepare it at IEEE Transactions level.

Collecting Reliable Resources

Before paper writing, we collect reliable resources such as 50+ journal papers, magazines, news, encyclopedias (books), benchmark datasets, and online resources.

Writing Rough Draft

We first create an outline of the paper and then write under each heading and sub-heading. The draft consists of the novel idea and the resources.

Proofreading & Formatting

We proofread and format the paper to fix typesetting errors, misspelled words, misplaced punctuation marks, and so on.

Native English Writing

We check the communication quality of the paper by having it rewritten by native English writers who completed their English literature studies at the University of Oxford.

Scrutinizing Paper Quality

The paper quality is examined by top experts who can readily fix issues in journal paper writing and confirm the level of the journal paper (SCI, Scopus, or normal).

Plagiarism Checking

We give a 100% guarantee of original journal paper writing. We never use previously published works.

MILESTONE 4: Paper Publication

Finding Apt Journal

We play a crucial role in this step, since it is very important for the scholar's future. Our experts will help you choose high Impact Factor (SJR) journals for publishing.

Lay Paper to Submit

We organize your paper for journal submission, which covers the preparation of the authors' biographies, cover letter, highlights of novelty, and suggested reviewers.

Paper Submission

We upload the paper and submit all prerequisites required by the journal, removing the frustration from paper publishing.

Paper Status Tracking

We track your paper's status, answer any questions raised before the review process, and give you frequent updates on the messages received from the journal.

Revising Paper Precisely

When we receive a revision decision, we prepare a point-by-point response that addresses all reviewer queries and resubmit the paper to secure final acceptance.

Get Accept & e-Proofing

We receive the final acceptance confirmation letter, and the editors send e-proofing and licensing forms to ensure originality.

Publishing Paper

Once the paper is published online, we inform you of the paper title, author information, journal name, volume, issue number, page numbers, and DOI link.

MILESTONE 5: Thesis Writing

Identifying University Format

We pay special attention to your thesis writing, and our 100+ thesis writers are proficient in writing theses for all university formats.

Gathering Adequate Resources

We collect primary and adequate resources for writing a well-structured thesis, using published research articles, 150+ reputed reference papers, a writing plan, and so on.

Writing Thesis (Preliminary)

We write the thesis chapter by chapter without empirical mistakes, and we provide a completely plagiarism-free thesis.

Skimming & Reading

Skimming involves reading the thesis and checking the abstract, conclusions, sections and sub-sections, paragraphs, sentences, and words, and writing the thesis in the chronological order of the papers.

Fixing Crosscutting Issues

This step is tricky when a thesis is written by amateurs. Proofreading and formatting are done by our world-class thesis writers, who avoid verbosity and brainstorm for meaningful writing.

Organize Thesis Chapters

We organize the thesis chapters by completing the following: elaborating chapters, structuring chapters, checking the flow of writing, correcting citations, etc.

Writing Thesis (Final Version)

We pay attention to the details that matter: the importance of the thesis contribution, a well-illustrated literature review, sharp and broad results and discussion, and a study of relevant applications.

How deal with significant issues ?

1. novel ideas.

Novelty is essential for a PhD degree. Our experts are bringing quality of being novel ideas in the particular research area. It can be only determined by after thorough literature search (state-of-the-art works published in IEEE, Springer, Elsevier, ACM, ScienceDirect, Inderscience, and so on). SCI and SCOPUS journals reviewers and editors will always demand “Novelty” for each publishing work. Our experts have in-depth knowledge in all major and sub-research fields to introduce New Methods and Ideas. MAKING NOVEL IDEAS IS THE ONLY WAY OF WINNING PHD.

2. Plagiarism-Free

To improve the quality and originality of works, we are strictly avoiding plagiarism since plagiarism is not allowed and acceptable for any type journals (SCI, SCI-E, or Scopus) in editorial and reviewer point of view. We have software named as “Anti-Plagiarism Software” that examines the similarity score for documents with good accuracy. We consist of various plagiarism tools like Viper, Turnitin, Students and scholars can get your work in Zero Tolerance to Plagiarism. DONT WORRY ABOUT PHD, WE WILL TAKE CARE OF EVERYTHING.

3. Confidential Info

We intend to keep your personal and technical information secret, as this is a basic worry for all scholars.

  • Technical Info: We never share your technical details with any other scholar, since we know the importance of the time and resources that scholars give us.
  • Personal Info: We restrict our experts' access to scholars' personal details. Only our organization's leadership team will have the basic and necessary information about scholars.


4. Publication

Most PhD consultancy services end their services at paper writing, but ours is different from others in giving a guarantee for both paper writing and publication in reputed journals. With our 18+ years of experience in delivering PhD services, we meet all requirements of journals (reviewers, editors, and editor-in-chief) for rapid publication. From the beginning of paper writing, we apply our smart work. PUBLICATION IS A ROOT FOR PHD DEGREE. WE LIKE A FRUIT FOR GIVING SWEET FEELING FOR ALL SCHOLARS.

5. No Duplication

After completion of your work, it is no longer available in our library, i.e., we erase it after completion of your PhD work, so we avoid giving duplicate content to scholars. This step makes our experts bring new ideas, applications, methodologies, and algorithms. Our work is more standard, high-quality, and universal. We make everything new for all scholars. INNOVATION IS THE ABILITY TO SEE THE ORIGINALITY. EXPLORATION IS OUR ENGINE THAT DRIVES INNOVATION SO LET’S ALL GO EXPLORING.

Client Reviews

I ordered a research proposal in the research area of Wireless Communications and it was very good, as far as I could tell.

I wished to complete my implementation using the latest software and tools, and I had no idea where to order it. My friend suggested this place, and it delivered what I expected.

It is a really good platform for all PhD services, and I have used it many times because of its reasonable prices, excellent customer service, and high quality.

My colleague recommended this service to me and I’m delighted with their services. They guided me a lot and gave me worthy content for my research paper.

I have never been disappointed with any of their services. I am still working with the professional writers and getting a lot of opportunities.

- Christopher

Once I entered this organization I felt relaxed, because many of my colleagues and family relations had suggested this service, and I received the best thesis writing.

I recommend them. They have professional writers for all types of writing (proposal, paper, thesis, assignment) support at an affordable price.

You guys did a great job and saved me a lot of money and time. I will keep working with you, and I will recommend you to others as well.

These experts are fast, knowledgeable, and dedicated to working under a short deadline. I got a good conference paper in a short span.

Guys! You are great and real experts in paper writing, since it exactly matched my demands. I will approach you again.

I am fully satisfied with the thesis writing. Thank you for your faultless service; I will come back again soon.

You offered me trusted customer service. I don’t have any cons to mention.

I was at the edge of my doctoral graduation and my thesis was a set of totally unconnected chapters. You people did magic and I got my complete thesis!!!

- Abdul Mohammed

A good family environment with collaboration, and a hardworking team who actually share their knowledge by offering PhD services.

I hugely enjoyed working with PhD services. I was asked several questions about my system development, and I marveled at their smooth, dedicated, and caring approach.

I had not provided any specific requirements for my proposal work, but you guys are awesome because I received a proper proposal. Thank you!

- Bhanuprasad

I read my entire research proposal and I liked that the concept suits my research issues. Thank you so much for your efforts.

- Ghulam Nabi

I am extremely happy with your project development support; the source code is easy to understand and execute.

Hi!!! You guys supported me a lot. Thank you and I am 100% satisfied with publication service.

- Abhimanyu

I found this to be a wonderful platform for scholars, so I highly recommend this service to all. I ordered a thesis proposal and they covered everything. Thank you so much!!!


MIT News | Massachusetts Institute of Technology


Understanding the impacts of mining on local environments and communities


A copper mining pit dug deep in the ground, surrounded by grooved rings of rock and flying dust.


Hydrosocial displacement refers to the idea that resolving water conflict in one area can shift the conflict to a different area. The concept was coined by Scott Odell, a visiting researcher in MIT’s Environmental Solutions Initiative (ESI). As part of ESI’s Program on Mining and the Circular Economy, Odell researches the impacts of extractive industries on local environments and communities, especially in Latin America. He discovered that hydrosocial displacements are often in regions where the mining industry is vying for use of precious water sources that are already stressed due to climate change.

Odell is working with John Fernández, ESI director and professor in the Department of Architecture, on a project that is examining the converging impacts of climate change, mining, and agriculture in Chile. The work is funded by a seed grant from MIT’s Abdul Latif Jameel Water and Food Systems Lab (J-WAFS). Specifically, the project seeks to answer how the expansion of seawater desalination by the mining industry is affecting local populations, and how climate change and mining affect Andean glaciers and the agricultural communities dependent upon them. By working with communities in mining areas, Odell and Fernández are gaining a sense of the burden that mining minerals needed for the clean energy transition is placing on local populations, and the types of conflicts that arise when water sources become polluted or scarce. This work is of particular importance considering over 100 countries pledged a commitment to the clean energy transition at the recent United Nations climate change conference, known as COP28.


Water, humanity’s lifeblood

At the March 2023 United Nations (U.N.) Water Conference in New York, U.N. Secretary-General António Guterres warned “water is in deep trouble. We are draining humanity’s lifeblood through vampiric overconsumption and unsustainable use and evaporating it through global heating.” A quarter of the world’s population already faces “extremely high water stress,” according to the World Resources Institute. In an effort to raise awareness of major water-related issues and inspire action for innovative solutions, the U.N. created World Water Day, observed every year on March 22. This year’s theme is “Water for Peace,” underscoring the fact that even though water is a basic human right and intrinsic to every aspect of life, it is increasingly fought over as supplies dwindle due to problems including drought, overuse, and mismanagement.

The “Water for Peace” theme is exemplified in Fernández and Odell’s J-WAFS project, where findings are intended to inform policies to reduce social and environmental harms inflicted on mining communities and their limited water sources. “Despite broad academic engagement with mining and climate change separately, there has been a lack of analysis of the societal implications of the interactions between mining and climate change,” says Odell. “This project is helping to fill the knowledge gap. Results will be summarized in Spanish and English and distributed to interested and relevant parties in Chile, ensuring that the results can be of benefit to those most impacted by these challenges,” he adds.

The effects of mining for the clean energy transition

Global climate change is understood to be the most pressing environmental issue facing humanity today. Mitigating climate change requires reducing carbon emissions by transitioning away from conventional energy derived from burning fossil fuels, to more sustainable energy sources like solar and wind power. Because copper is an excellent conductor of electricity, it will be a crucial element in the clean energy transition, in which more solar panels, wind turbines, and electric vehicles will be manufactured. “We are going to see a major increase in demand for copper due to the clean energy transition,” says Odell.

In 2021, Chile produced 26 percent of the world's copper, more than twice as much as any other country, Odell explains. Much of Chile’s mining is concentrated in and around the Atacama Desert — the world’s driest desert. Unfortunately, mining requires large amounts of water for a variety of processes, including controlling dust at the extraction site, cooling machinery, and processing and transporting ore.

Chile is also one of the world’s largest exporters of agricultural products. Farmland is typically situated in the valleys downstream of several mines in the high Andes region, meaning mines get first access to water. This can lead to water conflict between mining operations and agricultural communities. Compounding the problem of mining for greener energy materials to combat climate change, are the very effects of climate change. According to the Chilean government, the country has suffered 13 years of the worst drought in history. While this is detrimental to the mining industry, it is also concerning for those working in agriculture, including the Indigenous Atacameño communities that live closest to the Escondida mine, the largest copper mine in the world. “There was never a lot of water to go around, even before the mine,” Odell says. The addition of Escondida stresses an already strained water system, leaving Atacameño farmers and individuals vulnerable to severe water insecurity.

What’s more, waste from mining, known as tailings, includes minerals and chemicals that can contaminate water in nearby communities if not properly handled and stored. Odell says the secure storage of tailings is a high priority in earthquake-prone Chile. “If an earthquake were to hit and damage a tailings dam, it could mean toxic materials flowing downstream and destroying farms and communities,” he says.

Chile’s treasured glaciers are another piece of the mining, climate change, and agricultural puzzle. Caroline White-Nockleby, a PhD candidate in MIT’s Program in Science, Technology, and Society, is working with Odell and Fernández on the J-WAFS project and leading the research specifically on glaciers. “These may not be the picturesque bright blue glaciers that you might think of, but they are, nonetheless, an important source of water downstream,” says White-Nockleby. She goes on to explain that there are a few different ways that mines can impact glaciers.

In some cases, mining companies have proposed to move or even destroy glaciers to get at the ore beneath. Other impacts include dust from mining that falls on glaciers. White-Nockleby says, “this makes the glaciers a darker color, so, instead of reflecting the sun's rays away, [the glacier] may absorb the heat and melt faster.” This shows that even when not directly intervening with glaciers, mining activities can cause glacial decline, adding to the threat glaciers already face due to climate change. She also notes that “glaciers are an important water storage facility,” describing how, on an annual cycle, glaciers freeze and melt, allowing runoff that downstream agricultural communities can utilize. If glaciers suddenly melt too quickly, flooding of downstream communities can occur.

Desalination offers a possible, but imperfect, solution

Chile’s extensive coastline makes it uniquely positioned to utilize desalination — the removal of salts from seawater — to address water insecurity. Odell says that “over the last decade or so, there's been billions of dollars of investments in desalination in Chile.”

As part of his dissertation work at Clark University, Odell found broad optimism in Chile for solving water issues in the mining industry through desalination. Not only was the mining industry committed to building desalination plants, there was also political support, and support from some community members in highland communities near the mines. Yet, despite the optimism and investment, desalinated water was not replacing the use of continental water. He concluded that “desalination can’t solve water conflict if it doesn't reduce demand for continental water supplies.”

However, after publishing those results, Odell learned that new estimates at the national level showed that desalination operations had begun to replace the use of continental water after 2018. In two case studies that he currently focuses on (the Escondida and Los Pelambres copper mines), the mining companies have expanded their desalination objectives in order to reduce extraction from key continental sources. This seems to be due to a variety of factors. For one thing, in 2022, Chile’s water code was reformed to prioritize human water consumption and environmental protection of water during scarcity and in the allocation of future rights. It also shortened the granting of water rights from “in perpetuity” to 30 years. Under this new code, it is possible that the mining industry may have expanded its desalination efforts because it viewed continental water resources as less secure, Odell surmises.

As part of the J-WAFS project, Odell has found that recent reactions have been mixed when it comes to the rapid increase in the use of desalination. He spent over two months doing fieldwork in Chile by conducting interviews with members of government, industry, and civil society at the Escondida, Los Pelambres, and Andina mining sites, as well as in Chile’s capital city, Santiago. He has spoken to local and national government officials, leaders of fishing unions, representatives of mining and desalination companies, and farmers. He observed that in the communities where the new desalination plants are being built, there have been concerns from community members as to whether they will get access to the desalinated water, or if it will belong solely to the mines.

Interviews at the Escondida and Los Pelambres sites, in which desalination operations are already in place or under construction, indicate acceptance of the presence of desalination plants combined with apprehension about unknown long-term environmental impacts. At a third mining site, Andina, there have been active protests against a desalination project that would supply water to a neighboring mine, Los Bronces. In that community, there has been a blockade of the desalination operation by the fishing federation. “They were blockading that operation for three months because of concerns over what the desalination plant would do to their fishing grounds,” Odell says. And this is where the idea of hydrosocial displacement comes into the picture, he explains. Even though desalination operations are easing tensions with highland agricultural communities, new issues are arising for the communities on the coast. “We can't just look to desalination to solve our problems if it's going to create problems somewhere else,” Odell advises.

Within the process of hydrosocial displacement, interacting geographical, technical, economic, and political factors constrain the range of responses to address the water conflict. For example, communities that have more political and financial power tend to be better equipped to solve water conflict than less powerful communities. In addition, hydrosocial concerns usually follow the flow of water downstream, from the highlands to coastal regions. Odell says that this raises the need to look at water from a broader perspective.

“We tend to address water concerns one by one and that can, in practice, end up being kind of like whack-a-mole,” says Odell. “When we think of the broader hydrological system, water is very much linked, and we need to look across the watershed. We can't just be looking at the specific community affected now, but who else is affected downstream, and will be affected in the long term. If we do solve a water issue by moving it somewhere else, like moving a tailings dam somewhere else, or building a desalination plant, resources are needed in the receiving community to respond to that,” suggests Odell.

The company building the desalination plant and the fishing federation ultimately reached an agreement and the desalination operation will be moving forward. But Odell notes, “the protest highlights concern about the impacts of the operation on local livelihoods and environments within the much larger context of industrial pollution in the area.”

The power of communities

The protest by the fishing federation is one example of communities coming together to have their voices heard. Recent proposals by mining companies that would affect glaciers and other water sources used by agriculture communities have led to other protests that resulted in new agreements to protect local water supplies and the withdrawal of some of the mining proposals. Odell observes that communities have also gone to the courts to raise their concerns. The Atacameño communities, for example, have drawn attention to over-extraction of water resources by the Escondida mine. “Community members are also pursuing education in these topics so that there's not such a power imbalance between mining companies and local communities,” Odell remarks. This demonstrates the power local communities can have to protect continental water resources. The political and social landscape of Chile may also be changing in favor of local communities. Beginning with what is now referred to as the Estallido Social (social outburst) over inequality in 2019, Chile has undergone social upheaval that resulted in voters calling for a new constitution. Gabriel Boric, a progressive candidate, whose top priorities include social and environmental issues, was elected president during this period. These trends have brought major attention to issues of economic inequality, environmental harms of mining, and environmental justice, which is putting pressure on the mining industry to make a case for its operations in the country, and to justify the environmental costs of mining.

What happens after the mine dries up?

From his fieldwork interviews, Odell has learned that the development of mines within communities can offer benefits. Mining companies typically invest directly in communities through employment, road construction, and sometimes even by building or investing in schools, stadiums, or health clinics. Indirectly, mines can have spillover effects in the economy since miners might support local restaurants, hotels, or stores. But what happens when the mine closes? As one community member Odell interviewed stated: “When the mine is gone, what are we going to have left besides a big hole in the ground?”

Odell suggests that a multi-pronged approach should be taken to address the future state of water and mining. First, he says we need to have broader conversations about the nature of our consumption and production at domestic and global scales. “Mining is driven indirectly by our consumption of energy and directly by our consumption of everything from our buildings to devices to cars,” Odell states. “We should be looking for ways to moderate our consumption and consume smarter through both policy and practice so that we don’t solve climate change while creating new environmental harms through mining.” One of the main ways we can do this is by advancing the circular economy by recycling metals already in the system, or even in landfills, to help build our new clean energy infrastructure. Even so, the clean energy transition will still require mining, but according to Odell, that mining can be done better. “Mining companies and government need to do a better job of consulting with communities. We need solid plans and financing for mine closures in place from the beginning of mining operations, so that when the mine dries up, there's the money needed to secure tailings dams and protect the communities who will be there forever,” Odell concludes. Overall, it will take an engaged society — from the mining industry to government officials to individuals — to think critically about the role we each play in our quest for a more sustainable planet, and what that might mean for the most vulnerable populations among us.


  • iSchool Connect

Get to know Kyrie Zhixuan Zhou, PhD student

PhD student Kyrie Zhixuan Zhou's goal is to make information and communication technology (ICT) and artificial intelligence (AI) experiences more equitable, accessible, beneficial, and ethical for all. In his free time, he is devoted to helping junior researchers, especially those from populations not typically represented in STEM.

Why did you decide to pursue a degree in information sciences?

My bachelor's degree is in computer science, but gradually I realized my real passion is not pursuing state-of-the-art algorithms and computer systems but how they can be leveraged responsibly for human well-being.

When I was growing up, I witnessed how women's rights were overlooked, people with disabilities were invisible, and people in general were being censored and surveilled. These experiences motivated me to understand, design, and govern ICT/AI experience for social good, with a focus on vulnerable populations and from a human-centered perspective. A degree in information sciences allows me to pursue this research in collaboration with colleagues who have a similar passion.

Why did you choose the iSchool at Illinois?

Since I was young, the University of Illinois has been my dream university. Due to its reputation, the iSchool at Illinois was at the top of my list of information sciences when I was applying for graduate school.

Most importantly, I was fascinated by the interdisciplinary and exciting research conducted by iSchool faculty. Professor Stephen Downie's research on music and Assistant Professor Melissa Ocepek's research on food were refreshing to me back then. Now, working with the best advisors one can have in the world—Assistant Professor Madelyn Sanfilippo and Associate Professor Rachel Adler—I am able to explore my research interests to the fullest extent. I think I made the right choice.

What are your research interests?

My interests are broadly in technology accessibility, ethics, and education. I aspire to design, govern, and teach about ICT/AI experience for vulnerable populations. In my research, I leverage qualitative, quantitative, and design methods to gain deeper insights into the interaction between humans and technologies as well as how technologies result in societal impact. I'm also keen on proposing policy recommendations to turn research insights into practice. My ultimate goal is to make the ICT/AI experience more equitable, accessible, beneficial, and ethical for all.

What do you do outside of class?

I have been devoted to helping junior researchers thrive in their research, academic, and career development. Bridging the research divide for STEM students from rural areas, developing countries, and marginalized populations gives me the most satisfaction.

My biggest hobby is basketball. I play basketball, watch basketball games, and play basketball video games. I spend a lot of time with my dog Yinhe ("galaxy" in Chinese) and cat Mei-mei ("little sister" in Chinese).

What career plans or goals do you have?

I can see myself taking two paths. First is academia, as I love research, teaching, and mentoring. Second is industry, as I want to build technologies that really help vulnerable populations. I think some mixture of these two is likely.





  1. Trending Data Mining Thesis Topics

    Integration of MapReduce, Amazon EC2, S3, Apache Spark, and Hadoop into data mining. These are the recent trends in data mining. We insist that you choose one of the topics that interest you the most. Having an appropriate content structure or template is essential while writing a thesis.

  2. (PDF) Trends in data mining research: A two-decade review using topic

    Abstract and Figures. This work analyses the intellectual structure of data mining as a scientific discipline. To do this, we use topic analysis (namely, latent Dirichlet allocation, LDA) applied ...

  3. data mining Latest Research Papers

    The accurate average value is 74.05% of the existing COID algorithm, and our proposed algorithm has 77.21%. The average recall value is 81.19% and 89.51% of the existing and proposed algorithm, which shows that the proposed work efficiency is better than the existing COID algorithm. Download Full-text.

  4. PhD in Data Science

    PhD in Analytics and Data Science. Students pursuing a PhD in analytics and data science at Kennesaw State University must complete 78 credit hours: 48 course hours and 6 electives (spread over 4 years of study), a minimum 12 credit hours for dissertation research, and a minimum 12 credit-hour internship.

  5. PhD Research Topics in Data Mining

    In recent times, there has been massive growth in information generation through "IoT". At the same time, it is stored in "Cloud Computing". PhD Research Topics in Data Mining is the academic stock of hot topics. It intends to convert our line of thoughts to your research. As a result, it 'opens the way for research in data mining'.


    of data mining applications are not considered, and the internal mechanisms of models are not fully explored. Meanwhile, it is unknown how to utilize interpretation to improve models. To bridge the gap, I developed a series of interpretation methods that gradually increase the transparency of data mining models.

  7. Theses of the doctoral ( PhD ) dissertation Data mining and soft

    Theses of the doctoral ( PhD ) dissertation Data mining and soft computing algorithms for decision support systems @inproceedings{Kirly2013ThesesOT, title={Theses of the doctoral ( PhD ) dissertation Data mining and soft computing algorithms for decision support systems}, author={Andr{\'a}s Kir{\'a}ly}, year={2013}, url={https://api ...


    Moreover, data mining requires following these steps: Model Validation. 5. 3. patterns from the data mining algorithms. All the output patterns that are found by. data mining algorithms are not ...

  9. data mining PhD Projects, Programmes & Scholarships

    Optimisation of additive manufacturing process using data-driven machine-learning approach (Fully Funded PhD) University College London Department of Mechanical Engineering. Lead supervisor: Dr Chu Lun Alex Leung. Eligibility: Open to UK students and international students. Fully Funded. 3 years of Home tuition fees (currently £5,860/year) and ...

  10. PDF Dataset Proximity Mining for Supporting Schema Matching and Data Lake

    Thesis submitted: November, 2020 PhD Supervisors: Prof. Alberto Abelló ... which consumes the metadata extracted via automated data mining algorithms in order to detect related datasets and to propose for them: Data Lake: Dataset: Proximity. Dataset Proximity Mining for Supporting Schema Matching

  11. PDF Novel Data Mining Algorithms for Analysis of Electronic Health Records

    Data mining is the process of discovering interesting knowledge or information from large amounts of data. Electronic health records (EHRs) are complex large datasets describing the health status and treatment of patients for each of their encounters with a health system. EHRs contain both structured (i.e. lab values, medical diagnosis codes ...

  12. Data Mining Dissertation Topics

    Data Mining Dissertation Topics. The term "data mining" refers to an intelligent data lookup capacity that uses statistics-based algorithms and methodologies to find trends, patterns, links, and correlations within the collected data and records. Audio, pictorial, video, textual, online, and social media-based mining are only a few examples ...

  13. Dissertations / Theses: 'Data mining'

    Specifically, this Doctoral dissertation presents three novel, fast and scalable data mining algorithms well-suited to analyze large sets of complex data: the method Halite for correlation clustering; the method BoW for clustering Terabyte-scale datasets; and the method QMAS for labeling and summarization.

  14. A Study of Heart Disease Diagnosis Using Machine Learning and Data Mining

    3) Machine Learning algorithms allowed us to analyze clinical data, draw relationships between diagnostic variables, design the predictive model, and test it against the new case. The predictive model achieved an accuracy of 89.4 percent using the RandomForest Classifier's default setting to predict heart diseases.

  15. PDF Lewis Adam Whitley April, 2018 Director of Thesis: Qin Ding, Ph.D

    EDUCATIONAL DATA MINING AND ITS USES TO PREDICT THE MOST PROSPEROUS LEARNING ENVIRONMENT by Lewis Adam Whitley April, 2018 Director of Thesis: Qin Ding, Ph.D. Major Department: Department of Computer Science The use of technology and data analysis within the classroom has been a resourceful tool

  16. PDF Machine learning and data mining for yeast functional genomics

    This thesis presents an investigation into machine learning and data mining methods that can be used on data from the Saccharomyces cerevisiae genome. The aim is to predict functional class for ORFs (Open Reading Frames) whose function is currently unknown. Analysis of the yeast genome provides many challenges to existing computational ...

  17. PhD Projects in Data Mining [Top 15 Trending Research Area]

    Most Researched Data Mining Topics in Current Days. Graph Mining for Malware Detection. Data Assimilation by Neural Networks. Task-Oriented Pattern Mining. Web Mining. Big Data Mining. Cyber Security for Massive Data. 5G Technology. Software Defined Networking.

  18. Latest Research and Thesis topics in Data Mining

    Topics to study in data mining. Data mining is a relatively new thing and many are not aware of this technology. This can also be a good topic for M.Tech thesis and for presentations. Following are the topics under data mining to study: Fraud Detection. Crime Rate Prediction.

  19. Open Theses

    Open Topics. We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is listed below. If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential ...

  20. Data Mining

    Anomaly detection in text data using deep generative models Author: Esposito, R., 30 Aug 2019. Supervisor: Menkovski, V. (Supervisor 1) & van Ipenburg, W. (External person) (External coach) Student thesis: Master

  21. PhD Thesis Topics in Data Mining

    Thesis Topics in Data Mining. PhD Thesis Topics in Data Mining presents beneficial information about your data mining research area. We also offer guidance and support, both online and offline, for your convenience. Data mining is the process of discovering patterns and providing necessary information from a large-scale dataset. We also often provide ...

  22. Mining Engineering Graduate Theses and Dissertations

    Truck Cycle and Delay Automated Data Collection System (TCD-ADCS) for Surface Coal Mining, Patricio G. Terrazas Prado. PDF. New Abutment Angle Concept for Underground Coal Mining, Ihsan Berk Tulu. Theses/Dissertations from 2011 PDF. Experimental analysis of the post-failure behavior of coal and rock under laboratory compression tests, Dachao ...

  23. PhD Thesis on Data Mining

    PhD Thesis on Data Mining is a platform to help you succeed in your thesis. First, let's briefly check the meaning of data mining: "Data mining is the step to discover the data-centric patterns in a large database." Today, it is a leading domain in ML, DL, and AI. Due to taking part in these concepts, data mining is ...

  24. Understanding the impacts of mining on local environments and

    As part of his dissertation work at Clark University, Odell found broad optimism in Chile for solving water issues in the mining industry through desalination. Not only was the mining industry committed to building desalination plants, there was also political support, and support from some community members in highland communities near the mines.

  25. Get to know Kyrie Zhixuan Zhou, PhD student

    PhD student Kyrie Zhixuan Zhou's goal is to make information and communication technology (ICT) and artificial intelligence (AI) experiences more equitable, accessible, beneficial, and ethical for all. In his free time, he is devoted to helping junior researchers, especially those from populations not typically represented in STEM.