Learning Inductive Representations of Biomedical Data

Author: Samuel G. Finlayson
Publisher:
Total Pages: 189
Release: 2020
Genre: Artificial intelligence
ISBN:



Representation learning with neural networks has catalyzed rapid progress in biomedical pattern recognition. This progress, however, has generally been limited to domains where data are abundant, richly structured, and stable. In contrast, much of biomedicine is marked by limited and poorly structured data and by highly dynamic deployment environments. In particular, many of the most compelling problem areas in biomedicine involve the "long tails" of rare diseases and rare events. In this thesis, I confront the challenge of learning data representations whose utility can extend into dynamic and data-poor biomedical domains. I do so through three primary projects: First, I present a novel method for representation learning with subgraphs. This method, called Subgraph Neural Networks (Sub-GNN), learns disentangled representations of subgraph structure, neighborhood, and position through property-aware routing channels. The work is motivated by the desire for methods that can better contextualize patient phenotypes (encoded as subgraphs) into the broader context of biomedical knowledge, which could allow for better diagnostic generalization to novel disorders involving previously unseen phenotypes. Subgraph neural networks provide a principled framework for doing just this, by leveraging the relational inductive biases of the underlying knowledge graph while still respecting subgraphs as independent entities. Next, I present an approach to learning coordinated representations of small molecules and their associated transcriptional signatures. This approach extends a popular paradigm for drug development (known as connectivity mapping) to operate inductively, making predictions involving drugs that have not previously been experimentally assayed. I benchmark the performance of this approach, studying the circumstances under which it can and cannot achieve strong performance. 
Finally, I present an analysis of the clinical challenges posed by dataset shift, the phenomenon in which the input data to a deployed machine learning algorithm become mismatched with its training data. After introducing the problem of general dataset shift, I turn to a special case—adversarial examples—which reflect the worst-case generalization conditions for a machine learning system. I then build and test the representational robustness of three high-accuracy machine learning systems, constructing adversarial examples that cause their accuracy to drop to 0% on data that is imperceptibly different from the training data. I discuss the implications of these findings for clinical machine learning, offering specific regulatory recommendations. I conclude my thesis with lessons learned from these projects, and provide an extensive appendix with three additional smaller-scale projects that branched off of my research.
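To make the idea of an adversarial example concrete, here is a minimal sketch using the fast gradient sign method on a toy logistic regression classifier. This is an illustration only and not the thesis's experimental setup: the weights `w`, bias `b`, input `x`, and step size `eps` are all hypothetical values chosen for the example, and the thesis abstract does not state which attack methods it used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """Fast gradient sign perturbation for logistic regression.

    For cross-entropy loss, the gradient with respect to the input x
    is (p - y) * w, so each feature moves by eps in the direction
    that increases the loss.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical fixed classifier and input (assumptions for illustration).
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.2, 0.1, 0.2]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.2)

print("clean prediction:      ", predict_proba(w, b, x) > 0.5)
print("adversarial prediction:", predict_proba(w, b, x_adv) > 0.5)
```

Each feature changes by at most `eps`, yet the predicted class flips. On high-dimensional inputs such as images, a far smaller per-feature change can have the same effect, which is why such perturbations can be imperceptible.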

On Leveraging Representation Learning Techniques for Data Analytics in Biomedical Informatics

Author: Xi Hang Cao
Publisher:
Total Pages: 124
Release: 2019
Genre:
ISBN:



Representation Learning is ubiquitous in state-of-the-art machine learning workflows, including data exploration/visualization, data preprocessing, data model learning, and model interpretation. However, the majority of newly proposed Representation Learning methods are better suited to problems with a large amount of data. Applying these methods to problems with a limited amount of data may lead to unsatisfactory performance. Therefore, there is a need to develop Representation Learning methods tailored to problems with "small data", such as clinical and biomedical data analytics. In this dissertation, we describe our studies of tackling challenging clinical and biomedical data analytics problems from four perspectives: data preprocessing, temporal data representation learning, output representation learning, and joint input-output representation learning. Data scaling is an important component of data preprocessing. The objective of data scaling is to scale/transform the raw features into reasonable ranges such that each feature of an instance will be equally exploited by the machine learning model. For example, in a credit flaw detection task, a machine learning model may utilize a person's credit score and annual income as features, but because the ranges of these two features are different, the model may weigh one more heavily than the other. In this dissertation, I thoroughly introduce the problem of data scaling and describe an approach to data scaling which can intrinsically handle the outlier problem and lead to better model prediction performance. Learning new representations for data in unstandardized form is a common task in data analytics and data science applications. Usually, data come in a tabular form, namely, the data are represented by a table in which each row is a feature (row) vector of an instance.
However, it is also common that the data are not in this form; for example, texts, images, and video/audio records. In this dissertation, I describe the challenge of analyzing imperfect multivariate time series data in healthcare and biomedical research and show that the proposed method can learn a powerful representation that counters various imperfections and leads to an improvement in prediction performance. Learning output representations is a new aspect of Representation Learning, and its applications have shown promising results in complex tasks, including computer vision and recommendation systems. The main objective of an output representation algorithm is to explore the relationships among the target variables, such that a prediction model can efficiently exploit the similarities and potentially improve prediction performance. In this dissertation, I describe a learning framework which incorporates output representation learning into time-to-event estimation. In particular, the approach learns the model parameters and time vectors simultaneously. Experimental results show not only the effectiveness of this approach but also its interpretability, through visualizations of the time vectors in 2-D space. Learning the input (feature) representation, learning the output representation, and predictive modeling are closely related to each other. Therefore, considering them together in a joint framework is a natural extension of the state of the art. In this dissertation, I describe a large-margin ranking-based learning framework for time-to-event estimation with joint input embedding learning, output embedding learning, and model parameter learning. In the framework, I cast the functional learning problem as a kernel learning problem, and by adopting theory from Multiple Kernel Learning, I propose an efficient optimization algorithm. Empirical results also show its effectiveness on several benchmark datasets.
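The data scaling problem described in this abstract can be illustrated with a short sketch. The abstract does not specify the dissertation's own scaling method, so the code below swaps in a standard, well-known alternative (scaling by median and interquartile range) purely to show why outliers matter. The income values are made up for the example.

```python
import statistics


def min_max_scale(values):
    """Map values linearly onto [0, 1] using the observed min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range.

    This is a standard outlier-tolerant scaling, used here as a
    stand-in; it is NOT the method proposed in the dissertation.
    """
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [(v - med) / iqr for v in values]


# Hypothetical annual incomes in thousands; the last one is an outlier.
incomes = [28, 35, 41, 48, 52, 60, 67, 75, 5000]

mm = min_max_scale(incomes)
rb = robust_scale(incomes)

# Spread of the eight ordinary values under each scaling.
print("min-max spread:", max(mm[:-1]) - min(mm[:-1]))
print("robust spread: ", max(rb[:-1]) - min(rb[:-1]))
```

With min-max scaling, the single outlier compresses every ordinary value into a tiny sliver of the range, so a model sees almost no variation in that feature. The median and interquartile range are barely affected by the outlier, so the ordinary values keep a usable spread.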

Learning with Limited Labeled Data in Biomedical Domain by Disentanglement and Semi-supervised Learning

Author: Prashnna Kumar Gyawali
Publisher:
Total Pages: 129
Release: 2021
Genre: Machine learning
ISBN:



"In this dissertation, we are interested in improving the generalization of deep neural networks for biomedical data (e.g., electrocardiogram signal, x-ray images, etc). Although deep neural networks have attained state-of-the-art performance and, thus, deployment across a variety of domains, similar performance in the clinical setting remains challenging due to its ineptness to generalize across unseen data (e.g., new patient cohort). We address this challenge of generalization in the deep neural network from two perspectives: 1) learning disentangled representations from the deep network, and 2) developing efficient semi-supervised learning (SSL) algorithms using the deep network. In the former, we are interested in designing specific architectures and objective functions to learn representations, where variations in the data are well separated, i.e., disentangled. In the latter, we are interested in designing regularizers that encourage the underlying neural function's behavior toward a common inductive bias to avoid over-fitting the function to small labeled data. Our end goal is to improve the generalization of the deep network for the diagnostic model in both of these approaches. In disentangled representations, this translates to appropriately learning latent representations from the data, capturing the observed input's underlying explanatory factors in an independent and interpretable way. With data's expository factors well separated, such disentangled latent space can then be useful for a large variety of tasks and domains within data distribution even with a small amount of labeled data, thus improving generalization. In developing efficient semi-supervised algorithms, this translates to utilizing a large volume of the unlabelled dataset to assist the learning from the limited labeled dataset, commonly encountered situation in the biomedical domain. 
By drawing ideas from different areas within deep learning like representation learning (e.g., autoencoder), variational inference (e.g., variational autoencoder), Bayesian nonparametrics (e.g., beta-Bernoulli process), learning theory (e.g., analytical learning theory), function smoothing (Lipschitz smoothness), etc., we propose several learning algorithms to improve generalization in the associated task. We test our algorithms on real-world clinical data and show that our approach yields significant improvement over existing methods. Moreover, we demonstrate the efficacy of the proposed models on benchmark data and simulated data to understand different aspects of the proposed learning methods. We conclude by identifying some of the limitations of the proposed methods, areas of further improvement, and broader future directions for the successful adoption of AI models in the clinical environment."--Abstract.

Deep Learning for Biomedical Data Analysis

Author: Mourad Elloumi
Publisher: Springer Nature
Total Pages: 358
Release: 2021-07-13
Genre: Medical
ISBN: 3030716767



This book is the first overview of Deep Learning (DL) for biomedical data analysis. It surveys the most recent techniques and approaches in this field, with both broad coverage and enough depth to be of practical use to working professionals. This book offers enough fundamental and technical information on these techniques, approaches and the related problems without overwhelming the reader. It presents the results of the latest investigations in the field of DL for biomedical data analysis. The techniques and approaches presented in this book deal with the most important and/or the newest topics encountered in this field. They combine fundamental theory of Artificial Intelligence (AI), Machine Learning (ML) and DL with practical applications in Biology and Medicine. The list of topics covered in this book is not exhaustive, but these topics will shed light on the implications of the presented techniques and approaches for other topics in biomedical data analysis. The book balances theoretical and practical coverage of a wide range of issues in the field of biomedical data analysis with DL. The few published books on DL for biomedical data analysis either focus on specific topics or lack technical depth. The chapters presented in this book were selected for quality and relevance. The book also presents experiments that provide qualitative and quantitative overviews in the field of biomedical data analysis. The reader will require some familiarity with AI, ML and DL and will learn about techniques and approaches that deal with the most important and/or the newest topics encountered in the field of DL for biomedical data analysis. He/she will discover both the fundamentals behind DL techniques and approaches, and their applications to biomedical data. This book can also serve as a reference book for graduate courses in Bioinformatics, AI, ML and DL.
The book is aimed not only at professional researchers and practitioners but also at graduate students, senior undergraduate students and young researchers. It is intended to point the way to new techniques and approaches for making new discoveries.

Graph Representation Learning

Author: William L. Hamilton
Publisher: Springer Nature
Total Pages: 141
Release: 2022-06-01
Genre: Computers
ISBN: 3031015886



Graph-structured data is ubiquitous throughout the natural and social sciences, from telecommunication networks to quantum chemistry. Building relational inductive biases into deep learning architectures is crucial for creating systems that can learn, reason, and generalize from this kind of data. Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of convolutional neural networks to graph-structured data, and neural message-passing approaches inspired by belief propagation. These advances in graph representation learning have led to new state-of-the-art results in numerous domains, including chemical synthesis, 3D vision, recommender systems, question answering, and social network analysis. This book provides a synthesis and overview of graph representation learning. It begins with a discussion of the goals of graph representation learning as well as key methodological foundations in graph theory and network analysis. Following this, the book introduces and reviews methods for learning node embeddings, including random-walk-based methods and applications to knowledge graphs. It then provides a technical synthesis and introduction to the highly successful graph neural network (GNN) formalism, which has become a dominant and fast-growing paradigm for deep learning with graph data. The book concludes with a synthesis of recent advancements in deep generative models for graphs—a nascent but quickly growing subset of graph representation learning.
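The neural message-passing idea mentioned in this description can be sketched in a few lines. The example below is a deliberately simplified illustration and not code from the book: it uses plain mean aggregation with no learned weights or nonlinearity, and the function name `message_passing_step`, the three-node path graph, and the feature values are all assumptions made for the example.

```python
def message_passing_step(adjacency, features):
    """One round of mean-aggregation message passing.

    Each node's new vector is the average of its own vector and its
    neighbors' vectors. Real GNN layers add learned weight matrices
    and a nonlinearity; both are omitted here for clarity.
    """
    updated = {}
    for node, neighbors in adjacency.items():
        group = [node] + list(neighbors)
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in group) / len(group)
            for d in range(dim)
        ]
    return updated


# Hypothetical path graph a - b - c with one-hot style features.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 1.0]}

h1 = message_passing_step(adjacency, features)
h2 = message_passing_step(adjacency, h1)

print("after 1 round, node a:", h1["a"])
print("after 2 rounds, node a:", h2["a"])
```

After one round, node `a` has only seen its direct neighbor `b`. After two rounds, its second component becomes nonzero because information from `c`, two hops away, has arrived through `b`. Stacking rounds in this way is how such models use graph structure as a relational inductive bias.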

Biomedical Data Mining for Information Retrieval

Author: Sujata Dash
Publisher: John Wiley & Sons
Total Pages: 450
Release: 2021-08-06
Genre: Computers
ISBN: 1119711266



This book not only emphasizes traditional computational techniques but also discusses data mining, biomedical image processing, and information retrieval, with broad coverage of basic scientific applications. Biomedical Data Mining for Information Retrieval comprehensively covers the mining of biomedical text, images and visual features for information retrieval. Biomedical and health informatics is an emerging field of research at the intersection of information science, computer science, and healthcare, and it brings tremendous opportunities and challenges because abundant biomedical data are easily available for further analysis. The aim of healthcare informatics is to ensure high-quality, efficient healthcare, better treatment and better quality of life by analyzing biomedical and healthcare data, including patients' data, electronic health records (EHRs) and lifestyle. Previously, it was a common requirement to have a domain expert develop a model for biomedical or healthcare applications; however, recent advancements in representation learning algorithms allow us to develop the model automatically. Biomedical image mining is a novel research area, driven by the vast amount of biomedical images that are increasingly generated and stored digitally. These images are mainly in the form of computed tomography (CT), X-ray, nuclear medicine imaging (PET, SPECT), magnetic resonance imaging (MRI) and ultrasound. Patients' biomedical images can be digitized using data mining techniques and may help in answering several important and critical questions relating to healthcare. Image mining in medicine can help to uncover new relationships between data and reveal new useful information that can be helpful for doctors in treating their patients.

Audience: Researchers in various fields, including computer science, medical informatics, healthcare IoT, artificial intelligence, machine learning, image processing, and clinical big data analytics.

Predictive Modeling in Biomedical Data Mining and Analysis

Author: Sudipta Roy
Publisher: Academic Press
Total Pages: 346
Release: 2022-08-28
Genre: Science
ISBN: 0323914454



Predictive Modeling in Biomedical Data Mining and Analysis presents major technical advancements and research findings in the field of machine learning in biomedical image and data analysis. The book examines recent technologies and studies in preclinical and clinical practice in computational intelligence. The authors present leading-edge research in the science of processing, analyzing and utilizing all aspects of advanced computational machine learning in biomedical image and data analysis. As the application of machine learning is spreading to a variety of biomedical problems, including automatic image segmentation, image classification, disease classification, fundamental biological processes, and treatments, this is an ideal reference. Machine Learning techniques are used as predictive models for many types of applications, including biomedical applications. These techniques have shown impressive results across a variety of domains in biomedical engineering research. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood, hence the need for new resources and information.

Includes predictive modeling algorithms for both Supervised Learning and Unsupervised Learning for medical diagnosis, data summarization and pattern identification.
Offers complete coverage of predictive modeling in biomedical applications, including data visualization, information retrieval, data mining, image pre-processing and segmentation, mathematical models and deep neural networks.
Provides readers with leading-edge coverage of biomedical data processing, including high dimension data, data reduction, clinical decision-making, deep machine learning in large data sets, multimodal, multi-task, and transfer learning, as well as machine learning with Internet of Biomedical Things applications.

Leveraging Structure and Knowledge in Clinical and Biomedical Representation Learning

Author: Matthew Brian Andrew McDermott
Publisher:
Total Pages: 0
Release: 2022
Genre:
ISBN:



Datasets in the machine learning for health and biomedicine domain are often noisy, irregularly sampled, only sparsely labeled, and small relative to the dimensionality of both the data and the tasks. These problems motivate the use of representation learning in this domain, which encompasses a variety of techniques designed to produce representations of a dataset that are amenable to downstream modelling tasks. Representation learning in this domain can also take advantage of the significant external knowledge in the biomedical domain. In this thesis, I will explore novel pre-training and representation learning strategies for biomedical data which leverage external structure or knowledge to inform learning at both local and global scales. These techniques will be explored in 4 chapters: (1) leveraging unlabeled data to infer distributional constraints in a semi-supervised learning setting; (2) using graph convolutional neural networks over gene-gene co-regulatory networks to improve modelling of gene expression data; (3) adapting pre-training techniques from natural language processing to electronic health record data, and showing that novel methods are needed for electronic health record timeseries data; and (4) asserting global structure in pre-training applications through structure-inducing pre-training.

Signal Processing and Machine Learning for Biomedical Big Data

Author: Ervin Sejdic
Publisher: CRC Press
Total Pages: 1235
Release: 2018-07-04
Genre: Medical
ISBN: 1351061216



Within the healthcare domain, big data is defined as any "high volume, high diversity biological, clinical, environmental, and lifestyle information collected from single individuals to large cohorts, in relation to their health and wellness status, at one or several time points." Such data is crucial because within it lies vast amounts of invaluable information that could potentially change a patient's life, opening doors to alternate therapies, drugs, and diagnostic tools. Signal Processing and Machine Learning for Biomedical Big Data thus discusses modalities; the numerous ways in which this data is captured via sensors; and various sample rates and dimensionalities. Capturing, analyzing, storing, and visualizing such massive data has required new shifts in signal processing paradigms and new ways of combining signal processing with machine learning tools. This book covers several of these aspects in two ways: firstly, through theoretical signal processing chapters where tools aimed at big data (be it biomedical or otherwise) are described; and, secondly, through application-driven chapters focusing on existing applications of signal processing and machine learning for big biomedical data. This text is aimed at the curious researcher working in the field, as well as undergraduate and graduate students eager to learn how signal processing can help with big data analysis. It is the hope of Drs. Sejdic and Falk that this book will bring together signal processing and machine learning researchers to unlock existing bottlenecks within the healthcare field, thereby improving patient quality of life.

Provides an overview of recent state-of-the-art signal processing and machine learning algorithms for biomedical big data, including applications in the neuroimaging, cardiac, retinal, genomic, sleep, patient outcome prediction, critical care, and rehabilitation domains.
Provides contributed chapters from world leaders in the fields of big data and signal processing, covering topics such as data quality, data compression, statistical and graph signal processing techniques, and deep learning and their applications within the biomedical sphere.
Covers how expert domain knowledge can be used to advance signal processing and machine learning for biomedical big data applications.

Machine Learning for Biomedical Applications

Author: Maria Deprez
Publisher: Academic Press
Total Pages: 306
Release: 2023-09-12
Genre: Computers
ISBN: 0128229055



Machine Learning for Biomedical Applications: With Scikit-Learn and PyTorch presents machine learning techniques most commonly used in a biomedical setting. Avoiding a theoretical perspective, it provides a practical and interactive way of learning where concepts are presented in short descriptions followed by simple examples using biomedical data. Interactive Python notebooks are provided with each chapter to complement the text and aid understanding. Sections cover uses in biomedical applications, practical Python coding skills, mathematical tools that underpin the field, core machine learning methods, deep learning concepts with examples in Keras, and much more. This accessible and interactive introduction to machine learning and data analysis skills is suitable for undergraduates and postgraduates in biomedical engineering, computer science and the biomedical sciences, and for clinicians.

Gives a basic understanding of the most fundamental concepts within machine learning and their role in biomedical data analysis.
Shows how to apply a range of commonly used machine learning and deep learning techniques in biomedical problems.
Develops practical computational skills that are needed to manipulate complex biomedical data sets.
Shows how to design machine learning experiments that address specific problems related to biomedical data.