Professor Rachel Harrison
Professor in Computer Science
School of Engineering, Computing and Mathematics
Role
Rachel Harrison is professor of computer science at Oxford Brookes University, UK.
Her research interests include software metrics, machine learning, and requirements engineering. Rachel is well known for her work on empirical and automated software engineering. She has over 160 publications (with over 4,400 citations on Google Scholar) and has consulted widely with industry, working with organizations such as IBM, Philips Research Labs, Praxis Critical Systems and The Open Group.
Rachel has served on over 50 international program committees, including ICSE, Promise, ESEM and EASE, and was founder and PC Chair or Co-Chair of both the RAISE workshops at IEEE ICSE and the AIRE workshops at IEEE RE. She is a member of the BCS, IEEE and ACM and is Editor-in-Chief of the Software Quality Journal, published by Springer.
Teaching and supervision
Courses
- Artificial Intelligence (BSc (Hons), MSci)
- Advanced Computer Science (MSc, PGDip, PGCert)
- Artificial Intelligence (MSc, PGDip, PGCert)
- Computer Science (BSc (Hons))
Modules taught
Rachel is Module Leader for:
- Essential Maths for University study
- Study Skills
- Advanced Software Development
Research
Rachel is Director of DSERC (the Dependable Systems Engineering Research Centre), and was recently the Program Chair of AIRE, the Annual IEEE Workshop on AI and Requirements Engineering.
Centres and institutes
- Artificial Intelligence, Data Analysis and Systems (AIDAS) Institute
- Dependable Systems Engineering Research Centre (DSERC)
Projects
- AI apps for the mining of big data (AIMi)
- Automated review classification (ReClass)
- Multi-Criteria Decision Support using AI (MuD)
- Software Quality Improvement (SEQUIN)
- Spectra-based fault localisation (Spectra)
Publications
Journal articles
-
Bencomo N, Guo JC, Harrison R, Heyn HM, Menzies T, 'The Secret to Better AI and Better Software (is Requirements Engineering)'
IEEE Software 39 (1) (2021) pp.105-110
ISSN: 0740-7459 eISSN: 1937-4194
Abstract
Much has been written about the algorithmic role that AI plays for automation in SE. But what about the role of AI, augmented by human knowledge? Can we make a profound advance by combining human and artificial intelligence? Researchers in requirements engineering think so, arguing that requirement engineering is the secret weapon for better AI and better software.
-
Leitão-Júnior PS, de Freitas DM, Vergilio SR, Camilo-Junior CG, Harrison R, 'Search-based Fault Localisation: A Systematic Mapping Study'
Information and Software Technology 123 (2020)
ISSN: 0950-5849
Abstract
Context. Software Fault Localisation (FL) refers to finding faulty software elements related to failures produced as a result of test case execution. This is a laborious and time consuming task. To allow FL automation search-based algorithms have been successfully applied in the field of Search-Based Fault Localisation (SBFL). However, there is no study mapping the SBFL field to the best of our knowledge and we believe that such a map is important to promote new advances in this field. Objective. To present the results of a mapping study on SBFL, by characterising the proposed methods, identifying sources of used information, adopted evaluation functions, applied algorithms and elements regarding reported experiments. Method. Our mapping followed a defined process and a search protocol. The conducted analysis considers different dimensions and categories related to the main characteristics of SBFL methods. Results. All methods are grounded on the coverage spectra category. Overall the methods search for solutions related to suspiciousness formulae to identify possible faulty code elements. Most studies use evolutionary algorithms, mainly Genetic Programming, by using a single-objective function. There is little investigation of real-and-multiple-fault scenarios, and the subjects are mostly written in C and Java. No consensus was observed on how to apply the evaluation metrics. Conclusions. Search-based fault localisation has seen a rise in interest in the past few years and the number of studies has been growing. We identified some research opportunities such as exploring new sources of fault data, exploring multi-objective algorithms, analysing benchmarks according to some classes of faults, as well as, the use of a unique definition for evaluation measures.
-
Martin C, Aldea A, Duce D, Harrison R, Alshaigy B, 'The Role of Usability Engineering in the Development of an Intelligent Decision Support System'
Lecture Notes in Computer Science 11326 (2019) pp.142-161
ISSN: 0302-9743 eISSN: 0302-9743 ISBN: 9783030127381
Abstract
This paper presents an overview of the usability engineering process for the development of a personalised clinical decision support system for the management of type 1 diabetes. The tool uses artificial intelligence (AI) techniques to provide insulin bolus dose advice and carbohydrate recommendations that adapt to the individual. We describe the role of human factors and user-centred design in the creation of medical systems that must adhere to international standards. We focus specifically on the formative evaluation stage of this process. The preliminary analysis of data shows promising results.
-
Brown D, Aldea A, Harrison R, Martin C, Bayley I, 'Temporal case-based reasoning for type 1 diabetes mellitus bolus insulin decision support'
Artificial Intelligence in Medicine 85 (April 2018) (2018) pp.28-42
ISSN: 0933-3657 eISSN: 1873-2860
Abstract
Individuals with type 1 diabetes have to monitor their blood glucose levels, determine the quantity of insulin required to achieve optimal glycaemic control and administer it themselves subcutaneously, multiple times per day. To help with this process bolus calculators have been developed that suggest the appropriate dose. However these calculators do not automatically adapt to the specific circumstances of an individual and require fine-tuning of parameters, a process that often requires the input of an expert.
To overcome the limitations of the traditional methods this paper proposes the use of an artificial intelligence technique, case-based reasoning, to personalise the bolus calculation. A novel aspect of our approach is the use of temporal sequences to take into account preceding events when recommending the bolus insulin doses rather than looking at events in isolation.
The in silico results described in this paper show that given the initial conditions of the patient, the temporal retrieval algorithm identifies the most suitable case for reuse. Additionally through insulin-on-board adaptation and postprandial revision, the approach is able to learn and improve bolus predictions, reducing the blood glucose risk index by up to 27% after three revisions of a bolus solution.
-
Waite MA, Martin CE, Franklin R, Duce D, Harrison R, 'Human factors and data logging processes with the use of advanced technology for adults with type 1 diabetes (T1DM): A systematic integrative review'
Journal of Medical Internet Research 5 (1) (2017)
ISSN: 1439-4456 eISSN: 1438-8871
Abstract
Background: People with T1DM are confronted with self-management tasks for which they need to develop strategies to balance the risks of long-term complications with those of hypoglycemic events. The potential of advanced and evolving technology to address these issues involves consideration of psychological and behavioral constructs alongside evaluation of the usability of devices. Access and uptake of advanced technology is further influenced by economic factors and health care provider capacity to support such interventions. Previous reviews have either focused upon clinical outcomes or descriptively scoped the literature. In addition, some have synthesized studies on adults with those on children and young people where human factors are different. Objective: The objective of this review was to describe the relationship between human factors and adherence with technology for data logging processes in adults with T1DM and to explore the factors which influence this association. Methods: A systematic search of the literature was undertaken in accordance with the PRISMA guidelines. Quality appraisal of each study was undertaken. Data were abstracted and categorized into the themes that underpinned the human factor constructs that were examined. Results: Eighteen studies were included in the review. Six constructs emerged from the data analysis: The relationship between adherence to data logging and measurable outcomes; Satisfaction with the transition to advanced technology for self-management; Use of advanced technology and time spent on diabetes related activities; Strategies to mediate the complexities of diabetes and the use of advanced technology; Cognition in the wild and, Meanings, views and perspectives from the users of technology. Conclusions: evidence of increased treatment satisfaction was found on transition from traditional to advanced technology use (insulin pump and continuous glucose monitoring (CGM)); the most significant contributing factor was when blood glucose (BG) levels were consistently
evidence that logging of data was positively correlated with increasing age when using an app that provided meaningful feedback (regression coefficient = 55.8 recordings/year; P = .009). Furthermore, there were benefits of CGM for older people in mediating complexities and their fears of hypoglycemia with reported significant differences in well-being (P = .009). Qualitative studies within the review aimed to explore the use and uptake of technology within the context of everyday lives. There were ‘frustrations’ with CGM, continuous subcutaneous insulin infusion (CSII), calibration of devices and alarms. This created implications for “body image” and the way in which “significant others” impacted on the behavior and attitude of the individual towards technology use. There were wide variations in the normal use of and interaction with technology across a continuum of sociocultural contexts, which has implications for the way in which future technologies should be designed. Many of the quantitative studies in the review were limited by small sample sizes. This may make it difficult to generalize findings to other contexts. This is further limited by a sample that was predominantly Caucasian, well-controlled and engaged with their self-care. However, the use of critical appraisal frameworks has highlighted areas where research into human factors and data logging processes of individuals could be improved. This includes engaging people in the design of the technology, especially hard-to-reach or marginalized groups.
-
Hernández-González J, Rodriguez D, Inza I, Harrison R, Lozano JA, 'Learning to classify software defects from crowds: A novel approach'
Applied Soft Computing 62 (January 2018) (2017) pp.579-591
ISSN: 1568-4946
Abstract
In software engineering, associating each reported defect with a category allows, among many other things, for the appropriate allocation of resources. Although this classification task can be automated using standard machine learning techniques, the categorization of defects for model training requires expert knowledge, which is not always available. To circumvent this dependency, we propose to apply the learning from crowds paradigm, where training categories are obtained from multiple non-expert annotators (and so may be incomplete, noisy or erroneous) and, dealing with this subjective class information, classifiers are efficiently learnt. To illustrate our proposal, we present two real applications of the IBM’s orthogonal defect classification working on the issue tracking systems from two different real domains. Bayesian network classifiers learnt using two state-of-the-art methodologies from data labeled by a crowd of annotators are used to predict the category (impact) of reported software defects. The considered methodologies show enhanced performance regarding the straightforward solution (majority voting) according to different metrics. This shows the possibilities of using non-expert knowledge aggregation techniques when expert knowledge is unavailable.
-
Harrison R, 'In this issue'
Software Quality Journal 24 (4) (2016) pp.877-878
ISSN: 0963-9314 eISSN: 1573-1367
-
Harrison R, Rodriguez D, Ruiz R, Riquelme JC, 'A study of subgroup discovery approaches for defect prediction'
Information and Software Technology 55 (10) (2013) pp.1810-1822
ISSN: 0950-5849
Abstract
Context
Although many papers have been published on software defect prediction techniques, machine learning approaches have yet to be fully explored.
Objective
In this paper we suggest using a descriptive approach for defect prediction rather than the precise classification techniques that are usually adopted. This allows us to characterise defective modules with simple rules that can easily be applied by practitioners and deliver a practical (or engineering) approach rather than a highly accurate result.
Method
We describe two well-known subgroup discovery algorithms, the SD algorithm and the CN2-SD algorithm to obtain rules that identify defect prone modules. The empirical work is performed with publicly available datasets from the Promise repository and object-oriented metrics from an Eclipse repository related to defect prediction. Subgroup discovery algorithms mitigate against characteristics of datasets that hinder the applicability of classification algorithms and so remove the need for preprocessing techniques.
Results
The results show that the generated rules can be used to guide testing effort in order to improve the quality of software development projects. Such rules can indicate metrics, their threshold values and relationships between metrics of defective modules.
Conclusions
The induced rules are simple to use and easy to understand as they provide a description rather than a complete classification of the whole dataset. Thus this paper represents an engineering approach to defect prediction, i.e., an approach which is useful in practice, easily understandable and can be applied by practitioners.
-
Harrison R, Duce D, 'Usability of mobile applications: literature review and rationale for a new usability model'
Journal of Interaction Science 1 (1) (2013) pp.2-16
ISSN: 2194-0827
Abstract
The usefulness of mobile devices has increased greatly in recent years allowing users to perform more tasks in a mobile context. This increase in usefulness has come at the expense of the usability of these devices in some contexts. We conducted a small review of mobile usability models and found that usability is usually measured in terms of three attributes: effectiveness, efficiency and satisfaction. Other attributes, such as cognitive load, tend to be overlooked in the usability models that are most prominent despite their likely impact on the success or failure of an application. To remedy this we introduce the PACMAD (People At the Centre of Mobile Application Development) usability model which was designed to address the limitations of existing usability models when applied to mobile devices. PACMAD brings together significant attributes from different usability models in order to create a more comprehensive model. None of the attributes that it includes are new, but the existing prominent usability models ignore one or more of them. This could lead to an incomplete usability evaluation. We performed a literature search to compile a collection of studies that evaluate mobile applications and then evaluated the studies using our model.
-
Harrison R, 'Empirical findings on team size and productivity in software development'
Journal of Systems and Software 85 (3) (2012) pp.562-570
ISSN: 0164-1212
Abstract
The size of software project teams has been considered to be a driver of project productivity. Although there is a large literature on this, new publicly available software repositories allow us to empirically perform further research. In this paper we analyse the relationships between productivity, team size and other project variables using the International Software Benchmarking Standards Group (ISBSG) repository. To do so, we apply statistical approaches to a preprocessed subset of the ISBSG repository to facilitate the study. The results show some expected correlations between productivity, effort and time as well as corroborating some other beliefs concerning team size and productivity. In addition, this study concludes that in order to apply statistical or data mining techniques to these types of repositories extensive preprocessing of the data needs to be performed due to ambiguities, wrongly recorded values, missing values, unbalanced datasets, etc. Such preprocessing is a difficult and error prone activity that would need further guidance and information that is not always provided in the repository.
-
Flood D, Harrison R, Iacob C, Duce D, 'Evaluating Mobile Applications: a spreadsheet case study'
International Journal of Mobile Human Computer Interaction 4 (4) (2012) pp.37-65
ISSN: 1942-390X
Abstract
The power of mobile devices has increased dramatically in the last few years. These devices are becoming more sophisticated and allow users to accomplish a wide variety of tasks while on the move. The ease with which mobile apps can be created and distributed has resulted in a number of usability issues becoming more prevalent. This paper describes the range of usability issues encountered at all stages of the mobile app life cycle, from when users begin to search for an app to when they finally remove the app from their device. Using these results the authors developed a number of guidelines for both app developers and app platform developers that will improve the overall usability of mobile apps.
-
Boness K, Finkelstein A, Harrison R, 'A method for assessing confidence in requirements analysis'
Information and Software Technology 53 (10) (2011) pp.1084-1096
ISSN: 0950-5849
Abstract
Context: During development managers, analysts and designers often need to know whether enough requirements analysis work has been done and whether or not it is safe to proceed to the design stage. Objective: This paper describes a new, simple and practical method for assessing our confidence in a set of requirements. Method: We identified four confidence factors and used a goal oriented framework with a simple ordinal scale to develop a method for assessing confidence. We illustrate the method and show how it has been applied to a real systems development project. Results: We show how assessing confidence in the requirements could have revealed problems in this project earlier and so saved both time and money. Conclusion: Our meta-level assessment of requirements provides a practical and pragmatic method that can prove useful to managers, analysts and designers who need to know when sufficient requirements analysis has been performed.
-
Harrison R, 'In This Issue'
Software Quality Journal 19 (2011) pp.487-488
ISSN: 0963-9314 eISSN: 1573-1367
-
Boness K, Harrison R, 'Goal sketching from a concise business case'
International Journal on Advances in Software 3 (1-2) (2010) pp.90-99
ISSN: 1942-2628
Abstract
This paper describes how the business case can be characterized and used to quickly make an initial and structurally complete goal-responsibility model. This eases the task of bringing disciplined support to key decision makers in a development project in such a way that it can be instantiated quickly and thereafter support all key decisions. This process also greatly improves the understanding shared by the key decision makers and helps to identify and manage loadbearing assumptions. Recent research has revealed two interesting issues, which are highlighted in this paper.
-
Boness K, Harrison R, Liu K, 'Goal sketching: an agile approach to clarifying requirements'
International Journal on Advances in Software 1 (1) (2009) pp.1-13
ISSN: 1942-2628
Abstract
This paper describes a technique that can be used as part of a simple and practical agile method for requirements engineering. It is based on disciplined goal-responsibility modelling but eschews formality in favour of a set of practicality objectives. The technique can be used together with Agile Programming to develop software in internet time. We illustrate the technique and introduce lazy refinement, responsibility composition and context sketching. Goal sketching has been used in a number of real-world development projects, one of which is described here.
-
Boness K, Finkelstein A, Harrison R, 'A lightweight technique for assessing risks in requirements analysis'
IET Software 2 (1) (2008) pp.46-57
ISSN: 1751-8806
Abstract
A simple and practical technique for assessing the risks, that is, the potential for error, and consequent loss, in software system development, acquired during a requirements engineering phase is described. The technique uses a goal-based requirements analysis as a framework to identify and rate a set of key issues in order to arrive at estimates of the feasibility and adequacy of the requirements. The technique is illustrated and how it has been applied to a real systems development project is shown. How problems in this project could have been identified earlier is shown, thereby avoiding costly additional work and unhappy users.
-
Bartsch M, Harrison R, 'An exploratory study of the effect of aspect-oriented programming on maintainability'
Software Quality Journal 16 (1) (2008) pp.23-44
ISSN: 0963-9314
Abstract
In this paper we describe an exploratory assessment of the effect of aspect-oriented programming on software maintainability. An experiment was conducted in which 11 software professionals were asked to carry out maintenance tasks on one of two programs. The first program was written in Java and the second in AspectJ. Both programs implement a shopping system according to the same set of requirements. A number of statistical hypotheses were tested. The results did seem to suggest a slight advantage for the subjects using the object-oriented system since in general it took the subjects less time to answer the questions on this system. Also, both systems appeared to be equally difficult to modify. However, the results did not show a statistically significant influence of aspect-oriented programming at the 5% level. We are aware that the results of this single small study cannot be generalized. We conclude that more empirical research is necessary in this area to identify the benefits of aspect-oriented programming and we hope that this paper will encourage such research.
Book chapters
-
Harrison R, Veerappa V, 'Social Media Collaboration in Software Projects' in Ruhe G, Wohlin C (ed.), Software Project Management in a Changing World, Springer-Verlag Berlin Heidelberg (2014)
ISBN: 978-3-642-55034-8 eISBN: 978-3-642-55035-5
Abstract
Social media has had a big impact on the way that software projects are managed and the way that stakeholders interact with each other: indeed, the nature of software projects has evolved substantially in keeping with the evolution of technology. A direct consequence of the ubiquity of the Internet is the increasing trend toward cooperation outside the boundaries of an office. The interactions involved in software projects have changed accordingly and can be broadly divided into two types: (1) interactions among stakeholders who are in a single location (e.g., people sharing the same office space) and (2) interactions among stakeholders who are in distributed locations (e.g., software projects that are partly implemented offshore). Social media has been and remains a significant facilitator to these kinds of interactions. This chapter looks at the implications of the use of social media in software projects in today’s changing world.
-
Martin C, Flood D, Harrison R, 'A protocol for evaluating mobile applications' in Information systems research and exploring social artifacts: approaches and methodologies, IGI Global (2013)
eISBN: 1466624914
Abstract
The number of applications available for mobile phones is growing at a rate which makes it difficult for new application developers to establish the current state of the art before embarking on new product development. This chapter outlines a protocol for capturing a snapshot of the present state of applications in existence for a given field in terms of both usability and functionality. The proposed methodology is versatile in the sense that it can be implemented for any domain across all mobile platforms, which is illustrated here by its application to two dissimilar domains on three platforms. The chapter concludes with a critical evaluation of the process that was undertaken.
-
Rodriguez D, Ruiz R, Riquelme JC, Harrison R, 'Subgroup discovery for defect prediction' in Search based software engineering, Springer (2011)
ISBN: 9783642237157
Abstract
Although there is extensive literature in software defect prediction techniques, machine learning approaches have yet to be fully explored and in particular, Subgroup Discovery (SD) techniques. SD algorithms aim to find subgroups of data that are statistically different given a property of interest [1,2]. SD lies between predictive (finding rules given historical data and a property of interest) and descriptive tasks (discovering interesting patterns in data). An important difference with classification tasks is that the SD algorithms only focus on finding subgroups (e.g., inducing rules) for the property of interest and do not necessarily describe all instances in the dataset.
Conference papers
-
Duce D, Martin C, Russell A, Brown D, Aldea A, Alshaigy B, Harrison R, Waite M, Leal Y, Wos M, Fernandez-Balsells M, Real J, Nita L, López B, Massana J, Avari P, Herrero P, Jugnee N, Oliver N, Reddy M, 'Visualizing Usage Data from a Diabetes Management System'
(2020)
ISBN: 9783038681229
Abstract
This article explores the role for visualization in interpreting data collected by a customised analytics framework within a healthcare technology project. It draws on the work of the EU-funded PEPPER project, which has created a personalised decision-support system for people with type 1 diabetes. Our approach was an exercise in exploratory visualization, as described by Bergeron's three category taxonomy. The charts revealed different patterns of interaction, including variability in insulin dosing schedule, and potential causes of rejected advice. These insights into user behaviour are of especial value to this field, as they may help clinicians and developers understand some of the obstacles that hinder the uptake of diabetes technology.
-
Zhu H, Liu D, Bayley I, Harrison R, Cuzzolin F, 'Datamorphic Testing: A Method for Testing Intelligent Applications'
(2019) pp.149-156
ISBN: 9781728104935 eISBN: 9781728104928
Abstract
Adequate testing of AI applications is essential to ensure their quality. However, it is often prohibitively difficult to generate realistic test cases or to check software correctness. This paper proposes a new method called datamorphic testing, which consists of three components: a set of seed test cases, a set of datamorphisms for transforming test cases, and a set of metamorphisms for checking test results. With an example of face recognition application, the paper demonstrates how to develop datamorphic test frameworks, and illustrates how to perform testing in various strategies, and validates the approach using an experiment with four real industrial applications of face recognition.
-
de-Freitas D, Leitao-Junior P, Camilo-Junior C, Harrison R, 'Mutation-based Evolutionary Fault Localisation'
(2018) pp.2291-2298
ISBN: 9781509060177
Abstract
Fault localisation is an expensive and time-consuming stage of software maintenance. Research is continuing to develop new techniques to automate the process of reducing the effort needed for fault localisation without losing quality. For instance, spectrum-based techniques use execution information from testing to formulate measures for ranking a list of suspicious code locations at which the program may be defective: the suspiciousness formulae mainly combine variables related to code coverage and test results (pass or fail). Moreover previous research has evaluated mutation analysis data (mutation spectra) instead of coverage traces, to yield promising results. This paper reports on a Genetic Programming (GP) solution for the fault localisation problem together with a set of experiments to evaluate the GP solution with respect to baselines and benchmarks. The innovative aspect is the joint investigation of: (i) specialisation of suspiciousness formulae for certain contexts; (ii) the application of mutation spectra to GP-evolved formulae, i.e. signals other than program coverage; (iii) a comparison of the effectiveness of coverage spectra and mutation spectra in the context of evolutionary approaches; and (iv) an analysis of the mutation spectra quality. The results show the competitiveness of GP-evolved mutation spectra heuristics over coverage traces as well as over a number of baselines, and suggest that the quality of mutation-related variables increases the effectiveness of fault localisation heuristics.
-
Martin C, Aldea A, Duce D, Harrison R, Waite M, 'The Role of Usability Engineering in the Development of an Intelligent Decision Support System'
(2018)
Abstract
We describe the role of human factors in the development of a personalised clinical decision support system for type 1 diabetes self-management. The tool uses artificial intelligence (AI) techniques to provide insulin bolus dose advice and carbohydrate recommendations that adapt to the individual.
-
de-Freitas DM, Leitao-Junior PS, Camilo-Junior CG, Harrison R, 'Evolutionary Composition of Customised Fault Localisation Heuristics'
(2018)
Abstract
Fault localisation is one of the most difficult and costly parts in software debugging. Researchers have tried to automate this process by formulating measures for assessment of code elements’ suspiciousness. This paper reports an evolutionary-based approach to combine non-linearly 34 previous measures to formulate a new program oriented fault localisation heuristic. The method was evaluated with 107 single-bug programs and compared against 35 approaches – 34 spectrum-based heuristics and a previous evolutionary linear combination approach. The experiments have shown that the proposal consistently achieved competitive results compared to the others according to several effectiveness metrics.
-
Deocadez R, Harrison R, Rodriguez D, 'Automatically Classifying Requirements from App Stores: A Preliminary Study'
(2017)
-
Deocadez R, Harrison R, Rodriguez D, 'Preliminary Study on Applying Semi-Supervised Learning to App Store Analysis'
(2017)
ISBN: 9781450348041
Abstract
Semi-Supervised learning (SSL) is a data mining technique which comes in between supervised and unsupervised techniques, and is useful when a small number of instances in a dataset are labeled but a lot of unlabeled data is also available. This is the case with user reviews in application stores such as the Apple AppStore or Google Play, where very many reviews are available but classifying them into categories such as bug related review or feature request is expensive or at least labor intensive. SSL techniques are well-suited to this problem as classifying reviews not only takes time and effort, but may also be unnecessary. In this work, we analyse SSL techniques to show their viability and their capabilities in a dataset of reviews collected from the AppStore for both transductive (predicting existing instance labels during training) and inductive (predicting labels on unseen future data) performance.
-
Ibarguren-Arrieta I, Perez J, Muguerza J, Rodriguez D, Harrison R, 'The Consolidated Tree Construction Algorithm in Imbalanced Defect Prediction Datasets'
(2017)
ISBN: 9781509046010
Abstract: In this short paper, we compare well-known rule/tree classifiers in software defect prediction with the CTC decision tree classifier designed to deal with class imbalance. It is well-known that most software defect prediction datasets are highly imbalanced (non-defective instances outnumber defective ones). In this work, we focused only on tree/rule classifiers as these are capable of explaining the decision, i.e., describing the metrics and thresholds that make a module error prone. Furthermore, rules/decision trees provide the advantage that they are easily understood and applied by project managers and quality assurance personnel. The CTC algorithm was designed to cope with class imbalance and noisy datasets instead of using preprocessing techniques (oversampling or undersampling), ensembles or cost weights of misclassification. The experimental work was carried out using the NASA datasets and results showed that induced CTC decision trees performed better than or similarly to the rest of the rule/tree classifiers.
Open Access on RADAR
-
Brown D, Martin C, Duce D, Aldea A, Harrison R, 'Towards a Formal Model of Type 1 Diabetes for Artificial Intelligence'
(2017)
Abstract: Artificial Intelligence (AI) is potentially useful for cost effective diabetes self-management. One research priority for the development of robust and beneficial AI concerns the use of formal verification techniques to model such self-modifying systems. In the context of diabetes, formal methods may also have a role in fostering trust in the technology as well as facilitating dialogue between a multidisciplinary team to determine system requirements in a precise way. In this paper we show how the formal modelling language Event-B can be used to capture safety-critical constraints associated with AI systems for diabetes management.
Open Access on RADAR
-
Iacob C, Faily S, Harrison R, 'MARAM: Tool Support for Mobile App Review Management'
(2016) pp.42-50
ISBN: 9781631901379
Abstract: Mobile apps today have millions of user reviews available online. Such reviews cover a broad range of themes and are usually expressed in an informal language. They provide valuable information to developers, such as feature requests, bug reports, and detailed descriptions of one’s interaction with the app. Due to the overwhelmingly large number of reviews apps usually get associated with, managing and making sense of reviews is difficult. In this paper, we address this problem by introducing MARAM, a tool designed to provide support for managing and integrating online reviews with other software management tools available, such as GitHub, JIRA and Bugzilla. The tool is designed to a) automatically extract app development relevant information from online reviews, b) support developers’ queries on (subsets of) the user generated content available on app stores, namely online reviews, feature requests, and bugs, and c) support the management of online reviews and their integration with other software management tools available, namely GitHub, JIRA or Bugzilla.
-
Iacob C, Faily S, Harrison R, 'Mining for Mobile App Usability in Online Reviews'
(2016)
-
Boness K, Harrison R, 'The synergies between goal sketching and enterprise architecture'
(15637152) (2015) pp.46-52
ISBN: 978-1-5090-0110-1
Abstract: This paper introduces a pragmatic and practical method for requirements modeling. The method is built using the concepts of our goal sketching technique together with techniques from an enterprise architecture modeling language. Our claim is that our method will help project managers who want to establish early control of their projects and will also give managers confidence in the scope of their project. In particular we propose the inclusion of assumptions as first class entities in the ArchiMate enterprise architecture modeling language and an extension of the ArchiMate Motivation Model principle to allow radical as well as normative analyses. We demonstrate the usefulness of this method using a simple university library system as an example.
Open Access on RADAR
-
Rodriguez D, Herraiz I, Harrison R, Dolado J, Riquelme J, 'Preliminary Comparison of Techniques for Dealing with Imbalance in Software Defect Prediction'
Empirical Software Engineering (2014)
ISSN: 1382-3256 eISSN: 1573-7616
Abstract: Imbalanced data is a common problem in data mining when dealing with classification problems, where samples of a class vastly outnumber other classes. In this situation, many data mining algorithms generate poor models as they try to optimize the overall accuracy and perform badly in classes with very few samples. Software Engineering data in general and defect prediction datasets are not an exception and in this paper, we compare different approaches, namely sampling, cost-sensitive, ensemble and hybrid approaches to the problem of defect prediction with different datasets preprocessed differently. We have used the well-known NASA datasets curated by Shepperd et al. There are differences in the results depending on the characteristics of the dataset and the evaluation metrics, especially if duplicates and inconsistencies are removed as a preprocessing step.
-
Iacob C, Harrison R, Faily S, 'Online Reviews as First Class Artifacts in Mobile App Development'
130 (2014) pp.47-53
ISBN: 978-3-319-05452-0
Abstract: This paper introduces a framework for developing mobile apps. The framework relies heavily on app stores and, particularly, on online reviews from app users. The underlying idea is that app stores are proxies for users because they contain direct feedback from them. Such feedback includes feature requests and bug reports, which facilitate design and testing respectively. The framework is supported by MARA, a prototype system designed to automatically extract relevant information from online reviews.
-
Rodriguez D, Herraiz I, Harrison R, Dolado J, Riquelme JC, 'Preliminary Comparison of Techniques for Dealing with Imbalance in Software Defect Prediction'
(43) (2014)
ISBN: 978-1-4503-2476-2
Abstract: Imbalanced data is a common problem in data mining when dealing with classification problems, where samples of a class vastly outnumber other classes. In this situation, many data mining algorithms generate poor models as they try to optimize the overall accuracy and perform badly in classes with very few samples. Software Engineering data in general and defect prediction datasets are not an exception and in this paper, we compare different approaches, namely sampling, cost-sensitive, ensemble and hybrid approaches to the problem of defect prediction with different datasets preprocessed differently. We have used the well-known NASA datasets curated by Shepperd et al. There are differences in the results depending on the characteristics of the dataset and the evaluation metrics, especially if duplicates and inconsistencies are removed as a preprocessing step.
-
Veerappa V, Harrison R, 'An Empirical Validation of Coupling Metrics Using Automated Refactoring'
(2013) pp.271-274
ISBN: 978-0-7695-5056-5
Abstract: The validation of software metrics has received much interest over the years due to a desire to promote metrics which are both well-founded theoretically and have also been shown empirically to reflect our intuition and be practically useful. In this paper we describe how we used automatic refactoring to investigate the changes which occur to a number of metrics as software evolves. We use the concept of software volatility to quantify our comparison of the metrics.
-
Veerappa V, Harrison R, 'Assessing the maturity of requirements through argumentation: A good enough approach'
(2013) pp.670-675
ISBN: 978-1-4799-0215-6
Abstract: Requirements engineers need to be confident that enough requirements analysis has been done before a project can move forward. In the context of KAOS, this information can be derived from the soundness of the refinements: sound refinements indicate that the requirements in the goal-graph are mature enough or good enough for implementation. We can estimate how close we are to 'good enough' requirements using the judgments of experts and other data from the goals. We apply Toulmin's model of argumentation to evaluate how sound refinements are. We then implement the resulting argumentation model using Bayesian Belief Networks and provide a semi-automated way aided by Natural Language Processing techniques to carry out the proposed evaluation. We have performed an initial validation on our work using a small case-study involving an electronic document management system.
-
Brown D, Bayley I, Harrison R, Martin C, 'Developing a Mobile Case-based Reasoning Application to Assist Type 1 Diabetes Management'
(2013)
ISBN: 978-1-4673-5800-2
Abstract: Effective management of diabetes is crucial for patient wellbeing and the prevention of low blood sugar levels (Hypoglycemia) and high blood sugar levels (Hyperglycemia), both of which can be potentially dangerous. Traditionally log books are maintained by patients to record information such as insulin usage and their meals. The ever increasing popularity of smart phones has resulted in various applications being developed to allow patients to log data and help manage their condition. However these applications are often developed simply for the logging of data and only occasionally provide basic calculations to suggest insulin doses following a meal. The goal of this research is to use case-based reasoning techniques to suggest an insulin dosage for the patient as opposed to using a one calculation fits all approach. This is to be achieved by building a knowledge base of the patient's history that is then used to obtain a solution which best fits the current circumstances. The proposed case-based reasoning system is described alongside the development of the system to date and discussion into further research and development. The final implementation will be tested and validated using a diabetic patient simulator to create a knowledge base and observe system behavior and accuracy.
-
Harrison R, Rodriguez D, Ruiz M, Riquelme J, 'Multiobjective Simulation Optimisation in Software Project Management'
(2013) pp.1883-1890
Abstract: Traditionally, simulation has been used by project managers in optimising decision making. However, current simulation packages only include simulation optimisation which considers a single objective (or multiple objectives combined into a single fitness function). This paper aims to describe an approach that consists of using multiobjective optimisation techniques via simulation in order to help software project managers find the best values for initial team size and schedule estimates for a given project so that cost, time and productivity are optimised. Using a System Dynamics (SD) simulation model of a software project, the sensitivity of the output variables regarding productivity, cost and schedule using different initial team size and schedule estimations is determined. The generated data is combined with a well-known multiobjective optimisation algorithm, NSGA-II, to find optimal solutions for the output variables. The NSGA-II algorithm was able to quickly converge to a set of optimal solutions composed of multiple and conflicting variables from a medium size software project simulation model. Multiobjective optimisation and SD simulation modeling are complementary techniques that can generate the Pareto front needed by project managers for decision making. Furthermore, visual representations of such solutions are intuitive and can help project managers in their decision making process.
-
Iacob C, Veerappa V, Harrison R, 'What Are You Complaining About?: A Study of Online Reviews of Mobile Applications'
(2013)
Abstract: In this paper, we explore the content of online reviews of mobile applications to get a better understanding of the most recurring issues users report through reviews, and the way the price and the rating of an app influence the type and the amount of feedback users report. Results show that users tend to provide positive feedback, often associating it with requirements for additional features. Also, users tend to provide more feedback for the lower rated apps and the optimal price range was found to be between £2.25 and £3.50.
-
Flood D, Germanakos P, Harrison R, McCaffery F, Samaras G, 'Estimating Cognitive Overload in Mobile Applications for Decision Support within the Medical Domain'
(2012) pp.103-107
ISBN: 978-989-8565-12-9
Abstract: Mobile applications have the potential to improve the quality of care received by patients from their primary care physicians (PCP). They can allow doctors to access the information they need when and where they need it in order to make informed decisions regarding patients’ health. They can also allow patients to better control conditions such as Diabetes and Gaucher’s disease. However, there are a number of limitations to these devices, such as small screen sizes and limited processing power, which can produce cognitive overload which in turn can negatively impact upon the decision making processes. This paper introduces a new research direction which aims to predict, during the development of mobile health care applications, when cognitive overload is likely to occur. By identifying the user’s previous level of experience, their working memory, the complexity of the interface and the level of distraction imposed by the user’s context, a prediction can be made as to when cognitive overload is likely to occur.
-
Brown D, Bayley I, Harrison R, Martin C, 'Formal specification of a mobile diabetes management application using the Rodin platform and Event-B'
(2012) pp.43-44
-
Flood D, Harrison R, Iacob C, 'Lessons Learned from Evaluating the Usability of Mobile Spreadsheet Applications'
7623 (2012) pp.315-322
ISBN: 978-3-642-34346-9 eISBN: 978-3-642-34347-6
Abstract: It is estimated that 90% of all the analysts in business perform calculations on spreadsheets. Due to advances in technology, spreadsheet applications can now be used on mobile devices and several such applications are available for platforms such as Android and iOS. Research on spreadsheets revolves around several themes, but little work has been done in evaluating the usability of such applications (desktop or mobile). This paper presents lessons learned and usability guidelines derived from laboratory usability testing of mobile spreadsheet applications. Twelve participants were given a task to be solved using a mobile spreadsheet application and, based on the video recordings of their interactions with the application, patterns of recurring actions and sequences of actions were derived. Navigation, selection, feedback, and transparency of features were some of the main themes in the results of the testing, pointing to a set of guidelines which are also generalizable across other types of mobile applications.
-
Bayley I, Flood D, Harrison R, Martin C, 'Mobitest: A cross-platform tool for testing mobile applications'
(2012)
ISBN: 978-1-61208-230-1
Abstract: Testing is an essential part of the software development lifecycle. However, it can cost a lot of time and money to perform. For mobile applications, this problem is further exacerbated by the need to develop apps in a short time-span and for multiple platforms. This paper proposes MobiTest, a cross-platform automated testing tool for mobile applications, which uses a domain-specific language for mobile interfaces. With it, developers can define a single suite of tests that can then be run for the same application on multiple platforms simultaneously, with considerable savings in time and money.
-
Solomon BS, Duce D, Harrison R, Boness K, 'Modeling social media collaborative work'
(12835900) (2012) pp.43-49
ISBN: 978-1-4673-1756-6
Abstract: This paper proposes an approach for modeling Social Media Collaborative Work (SMCW). We consider Social Media Collaborative Work to consist of multi-stakeholder viewpoints and human activity linked together by social media. SMCW has great potential within complex multifaceted domains such as healthcare. In this paper we describe how to model SMCW in a way which shows the multi-stakeholder intentions, concerns and priorities. We are conducting empirical studies to develop our approach for modeling SMCW. In particular we are using action research with a self-help community to develop and validate our SMCW modeling approach. In our approach we make use of the soft systems methodology in combination with i* modeling and social psychology.
-
Rodriguez D, Herraiz I, Harrison R, 'On software engineering repositories and their open problems'
(12848745) (2012) pp.52-56
ISBN: 978-1-4673-1752-8
Abstract: In the last decade, a large number of software repositories have been created for different purposes. In this paper we present a survey of the publicly available repositories and classify the most common ones as well as discussing the problems faced by researchers when applying machine learning or statistical techniques to them.
-
Herraiz I, Rodriguez D, Harrison R, 'On the statistical distribution of object-oriented system properties'
(12835962) (2012) pp.56-62
ISBN: 978-1-4673-1763-4
Abstract: The statistical distributions of different software properties have been thoroughly studied in the past, including software size, complexity and the number of defects. In the case of object-oriented systems, these distributions have been found to obey a power law, a common statistical distribution also found in many other fields. However, we have found that for some statistical properties, the behavior does not entirely follow a power law, but a mixture between a lognormal and a power law distribution. Our study is based on the Qualitas Corpus, a large compendium of diverse Java-based software projects. We have measured the Chidamber and Kemerer metrics suite for every file of every Java project in the corpus. Our results show that the range of high values for the different metrics follows a power law distribution, whereas the rest of the range follows a lognormal distribution. This is a pattern typical of so-called double Pareto distributions, also found in empirical studies for other software properties.
-
Harrison R, 'Subgroup Discovery for Defect Prediction'
6956 (2012) pp.269-270
-
Martin C, Flood D, Sutton D, Aldea A, Harrison R, Waite M, 'A Systematic Evaluation of Mobile Applications for Diabetes Management'
6949 (2011) pp.466-469
ISBN: 978-3-642-23767-6 eISBN: 978-3-642-23768-3
Abstract: This short paper contains a summary of work that is currently in progress towards the development of an intelligent, personalised tool for diabetes management. A preliminary part of the development process has consisted of a systematic evaluation of existing applications for mobile phones.
-
Flood D, Harrison R, Martin CE, McDaid K, 'A systematic evaluation of mobile spreadsheet apps'
(2011)
Abstract: The power and flexibility of spreadsheets have made them an essential part of modern business. The increasingly mobile nature of business has created a need to access spreadsheets while on the move. Mobile devices such as the Apple iPhone and Blackberry have enabled users to do this but the small nature of these devices has caused a number of issues for mobile spreadsheet users. This paper presents a systematic evaluation of mobile spreadsheet apps available on the iOS platform which not only includes an examination of the range of available features and functions but also examines the usability of these applications. This work also recommends some ways in which the usability of mobile spreadsheet apps can be improved.
-
Rodriguez D, Ruiz M, Riquelme J, Harrison R, 'Multiobjective simulation optimization in software project management'
(2011) pp.1883-1890
ISBN: 978-1-4503-0557-0
Abstract: Traditionally, simulation has been used by project managers in optimising decision making. However, current simulation packages only include simulation optimisation which considers a single objective (or multiple objectives combined into a single fitness function). This paper aims to describe an approach that consists of using multiobjective optimisation techniques via simulation in order to help software project managers find the best values for initial team size and schedule estimates for a given project so that cost, time and productivity are optimised. Using a System Dynamics (SD) simulation model of a software project, the sensitivity of the output variables regarding productivity, cost and schedule using different initial team size and schedule estimations is determined. The generated data is combined with a well-known multiobjective optimisation algorithm, NSGA-II, to find optimal solutions for the output variables. The NSGA-II algorithm was able to quickly converge to a set of optimal solutions composed of multiple and conflicting variables from a medium size software project simulation model. Multiobjective optimisation and SD simulation modeling are complementary techniques that can generate the Pareto front needed by project managers for decision making. Furthermore, visual representations of such solutions are intuitive and can help project managers in their decision making process.
-
Rodriguez D, Ruiz M, Riquelme JC, Harrison R, 'Optimizacion multiobjetivo de la toma de decisiones en gestion de proyectos software basada en simulacion'
(2011)
Abstract: Simulation has frequently been used in recent years as a tool to help optimise decision making in software project management. However, the current tools for building and simulating models only include modules that allow the optimisation of a single objective. This does not seem sufficient for a field such as software project management, in which decisions frequently have to be made that optimise particular outcomes which often conflict with one another. This work presents an approach that applies multiobjective optimisation techniques to the results obtained by the simulation model, so that software project managers have a tool that lets them optimise their decision-making process more effectively. To illustrate the proposal, we present its application to selecting the values for the size of the development team and the schedule estimate of the project so that the time, cost and productivity indicators of the project are optimised at the same time.
-
Flood D, Harrison R, McDaid K, 'Spreadsheets on the Move: An Evaluation of Mobile Spreadsheets'
(2011)
ISBN: 978-0-9566256-9-4
Abstract: The power of mobile devices has increased dramatically in the last few years. These devices are becoming more sophisticated allowing users to accomplish a wide variety of tasks while on the move. The increasingly mobile nature of business has meant that more users will need access to spreadsheets while away from their desktop and laptop computers. Existing mobile applications suffer from a number of usability issues that make using spreadsheets in this way more difficult. This work represents the first evaluation of mobile spreadsheet applications. Through a pilot survey the needs and experiences of experienced spreadsheet users were examined. The range of spreadsheet apps available for the iOS platform was also evaluated in light of these users’ needs.
-
Garcia E, Martin C, Garcia A, Harrison R, Flood D, 'Systematic analysis of mobile diabetes management applications on different platforms'
(2011) pp.379-396
ISBN: 9783642253638
Abstract: There are a number of mobile applications available to help patients suffering from Type 1 diabetes to manage their condition, but the quality of these applications varies greatly. This paper details the findings from a systematic analysis of these applications on three mobile platforms (Android, iOS, and Blackberry) that was conducted to establish the state of the art in mobile applications for diabetes management. The findings from this analysis will help to inform the future development of more effective mobile applications to help patients suffering from Type 1 diabetes who wish to manage their condition with a mobile application.
-
Flood D, Harrison R, Nosseir A, 'Useful but tedious: An evaluation of mobile spreadsheets'
(6) (2011) pp.1-7
Abstract: The processing power of mobile devices has greatly increased in recent years. This increased power has allowed for a greater range of applications to be deployed on this platform, providing constant access to information for end users. The limited size of these devices, however, causes a number of usability issues which make these applications error prone and difficult to use. This paper presents current research being undertaken into the use of spreadsheets, an important end-user development environment, on a mobile device. The key errors observed during this work are presented here to highlight some of the problems with using spreadsheets in this way. We believe these problems may be due to the increased cognitive load placed on users who try to use spreadsheets in a mobile context.
-
Black S, Harrison R, Baldwin M, 'A survey of social media use in software systems development'
(2010)
ISBN: 978-1-60558-975-6
Abstract: In this paper, we describe the preliminary results of a pilot survey conducted to collect information on social media use in global software systems development. We created an on-line survey for developers who are using social media to communicate at work and whose work falls within the domain of software systems development, including web applications. Our results show that social media can enable better communication through the software system development process. 91% of respondents said that using social media has improved their working life.
-
Boness K, Harrison R, 'Goal Sketching and the Business Case'
(2009) pp.203-209
ISBN: 978-0-7695-3777-1
Abstract: This paper describes how the business case can be characterized and used to quickly make an initial and structurally complete goal-responsibility model. This eases the problem of bringing disciplined support to key decision makers in a development project in such a way that it can be instantiated quickly and thereafter support all key decision gateways. This process also greatly improves the understanding shared by the key decision makers and helps to identify and manage load-bearing assumptions.
-
Boness K, Harrison R, 'Goal Sketching with Activity Diagrams'
(10389734) (2008) pp.277-283
ISBN: 978-1-4244-3218-9
Abstract: Goal orientation is acknowledged as an important paradigm in requirements engineering. The structure of a goal-responsibility model provides opportunities for appraising the intention of a development. Creating a suitable model under agile constraints (time, incompleteness and catching up after an initial burst of creativity) can be challenging. Here we propose a marriage of UML activity diagrams with goal sketching in order to facilitate the production of goal-responsibility models under these constraints.
Other publications
-
Brown D, Duce D, Franklin R, Harrison R, Martin C, Waite M, 'SWiFT Seeing the Wood From the Trees: helping people make sense of their health data', (2015)
-
Rodriguez D, Ruiz R, Riquelme JC, Harrison R, 'Subgroup Discovery in Defect Prediction', (2011)
Abstract: Subgroup Discovery (SD) algorithms aim to find subgroups of data (represented by rules) that are statistically different given a property of interest [3] and do not describe all instances in the dataset. They usually describe the minority class (the interesting one). We deal with the problem of software defect prediction through SD, identifying software modules with a high probability of being defective.