Large Language Models (LLMs) like Codex are powerful tools for performing code completion and code generation tasks as they are trained on billions of lines of code from publicly available sources. Moreover, these models are capable of generating code snippets from Natural Language (NL) descriptions by learning languages and programming practices from public GitHub repositories. Although LLMs promise an effortless NL-driven deployment of software applications, the security of the code they generate has not been extensively investigated or documented. In this work, we present LLMSecEval, a dataset containing 150 NL prompts that can be leveraged for assessing the security performance of such models. Such prompts are NL descriptions of code snippets prone to various security vulnerabilities listed in MITRE’s Top 25 Common Weakness Enumeration (CWE) ranking. Each prompt in our dataset comes with a secure implementation example to facilitate comparative evaluations against code produced by LLMs. As a practical application, we show how LLMSecEval can be used for evaluating the security of snippets automatically generated from NL descriptions.
Regret, Delete, (Do Not) Repeat: An Analysis of Self-Cleaning Practices on Twitter After the Outbreak of the COVID-19 Pandemic
Díaz Ferreyra, Nicolás E., Shahi, Gautam Kishore, Tony, Catherine and 2 more authors
In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA ’23) 2023
During the outbreak of the COVID-19 pandemic, many people shared their symptoms across Online Social Networks (OSNs) like Twitter, hoping for others’ advice or moral support. Prior studies have shown that those who disclose health-related information across OSNs often tend to regret it and delete their publications afterwards. Hence, deleted posts containing sensitive data can be seen as manifestations of online regrets. In this work, we present an analysis of deleted content on Twitter during the outbreak of the COVID-19 pandemic. For this, we collected more than 3.67 million tweets describing COVID-19 symptoms (e.g., fever, cough, and fatigue) posted between January and April 2020. We observed that around 24% of the tweets containing personal pronouns were deleted either by their authors or by the platform after one year. As a practical application of the resulting dataset, we explored its suitability for the automatic classification of regrettable content on Twitter.
Developers Need Protection, Too: Perspectives and Research Challenges for Privacy in Social Coding Platforms
Díaz Ferreyra, Nicolás E., Imine, Abdessamad, Vidoni, Melina and 1 more author
In 16th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE 2023) 2023
Social Coding Platforms (SCPs) like GitHub have become central to modern software engineering thanks to their collaborative and version-control features. Like in mainstream Online Social Networks (OSNs) such as Facebook, users of SCPs are subjected to privacy attacks and threats given the high amounts of personal and project-related data available in their profiles and software repositories. However, unlike in OSNs, the privacy concerns and practices of SCP users have not been extensively explored or documented in the current literature. In this work, we present the preliminary results of an online survey (N=105) addressing developers’ concerns and perceptions about privacy threats stemming from SCPs. Our results suggest that, although users express concern about social and organisational privacy threats, they often feel safe sharing personal and project-related information on these platforms. Moreover, attacks targeting the inference of sensitive attributes are considered more likely than those seeking to re-identify source-code contributors. Based on these findings, we propose a set of recommendations for future investigations addressing privacy and identity management in SCPs.
GitHub Considered Harmful? Analyzing Open-Source Projects for the Automatic Generation of Cryptographic API Call Sequences
Tony, Catherine, Díaz Ferreyra, Nicolás, and Scandariato, Riccardo
In 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security Companion (QRS-C) 2022
Getting to know new people online and later meeting them offline for neighbourhood help, carpooling, or online dating has never been easier, thanks to social media performing computer-mediated introductions (CMIs). Unfortunately, interacting with strangers poses high risks such as unfulfilled expectations, fraud, or assaults. People most often tolerate risks if they believe others are trustworthy. However, conducting an online trustworthiness assessment is usually a challenge. Online cues differ from offline ones, and people either lack awareness of the assessment’s relevance or find it too complicated. On these grounds, this work aims to help software engineers develop CMI that supports users in their online trustworthiness assessment. We focus on trust-related software features and nudges to i) increase user awareness, ii) trigger the trustworthiness assessment and iii) enable the assessment online. For that reason, we extend feature models to give software engineers the possibility to create and document software features or nudges for trustworthiness assessment. The extended feature models for trustworthiness assessments can serve as reusable catalogues for validating features in terms of their impact on the trustworthiness assessment and for configuring CMI software product lines. Moreover, this work provides an example of how the extended feature models can be applied to catfishing protection in online dating.
ENAGRAM: An App to Evaluate Preventative Nudges for Instagram
Díaz Ferreyra, Nicolás E., Ostendorf, Sina, Aïmeur, Esma and 2 more authors
In 2022 European Symposium on Usable Security (EuroUSEC 2022) 2022
Online self-disclosure is perhaps one of the last decade’s most studied communication processes, thanks to the introduction of Online Social Networks (OSNs) like Facebook. Self-disclosure research has contributed significantly to the design of preventative nudges seeking to support and guide users when revealing private information in OSNs. Still, assessing the effectiveness of these solutions is often challenging since changing or modifying the choice architecture of OSN platforms is practically unfeasible. In turn, the effectiveness of numerous nudging designs is supported primarily by self-reported data instead of actual behavioral information. OBJECTIVE: This work presents ENAGRAM, an app for evaluating preventative nudges, and reports the first results of an empirical study conducted with it. Such a study aims to showcase how the app (and the data collected with it) can be leveraged to assess the effectiveness of a particular nudging approach. METHOD: We used ENAGRAM as a vehicle to test a risk-based strategy for nudging the self-disclosure decisions of Instagram users. For this, we created two variations of the same nudge (i.e., with and without risk information) and tested it in a between-subjects experimental setting. Study participants (N=22) were recruited via Prolific and asked to use the app regularly for 7 days. An online survey was distributed at the end of the experiment to measure some privacy-related constructs. RESULTS: From the data collected with ENAGRAM, we observed lower (though non-significant) self-disclosure levels when applying risk-based interventions. The constructs measured with the survey were not significant either, except for participants’ External Information Privacy Concerns (EIPC). IMPLICATIONS: Our results suggest that (i) ENAGRAM is a suitable alternative for conducting longitudinal experiments in a privacy-friendly way, and (ii) it provides a flexible framework for the evaluation of a broad spectrum of nudging solutions.
Community Detection for Access-Control Decisions: Analysing the Role of Homophily and Information Diffusion in Online Social Networks
Díaz Ferreyra, Nicolás E., Hecking, Tobias, Aïmeur, Esma and 2 more authors
Access-Control Lists (ACLs) (a.k.a. “friend lists”) are one of the most important privacy features of Online Social Networks (OSNs) as they allow users to restrict the audience of their publications. Nevertheless, creating and maintaining custom ACLs can introduce a high cognitive burden on average OSN users since it normally requires assessing the trustworthiness of a large number of contacts. In principle, community detection algorithms can be leveraged to support the generation of ACLs by mapping a set of examples (i.e. contacts labelled as “untrusted”) to the emerging communities inside the user’s ego-network. However, unlike users’ access-control preferences, traditional community-detection algorithms do not take the homophily characteristics of such communities into account (i.e. attributes shared among members). Consequently, this strategy may lead to inaccurate ACL configurations and privacy breaches under certain homophily scenarios. This work investigates the use of community-detection algorithms for the automatic generation of ACLs in OSNs. Particularly, it analyses the performance of the aforementioned approach under different homophily conditions through a simulation model. Furthermore, since private information may reach the scope of untrusted recipients through the re-sharing affordances of OSNs, information diffusion processes are also modelled and taken explicitly into account. Altogether, the removal of gatekeeper nodes is further explored as a strategy to counteract unwanted data dissemination.
SoK: Security of Microservice Applications: A Practitioners’ Perspective on Challenges and Best Practices
Billawa, Priyanka, Bambhore Tukaram, Anusha, Díaz Ferreyra, Nicolás E. and 3 more authors
In International Conference on Availability, Reliability and Security (ARES) 2022
Vul4J: A Dataset of Reproducible Java Vulnerabilities Geared Towards the Study of Program Repair Techniques
Bui, Quang-Cuong, Scandariato, Riccardo, and Díaz Ferreyra, Nicolás E.
In International Conference on Mining Software Repositories (MSR) 2022
Cybersecurity Discussions in Stack Overflow: A Developer-Centred Analysis of Engagement and Self-Disclosure Behaviour
Díaz Ferreyra, Nicolás E., Vidoni, Melina, Heisel, Maritta and 1 more author
Stack Overflow (SO) is a popular platform among developers seeking advice on various software-related topics, including privacy and security. As for many knowledge-sharing websites, the value of SO depends largely on users’ engagement, namely their willingness to answer, comment or post technical questions. Still, many of these questions (including cybersecurity-related ones) remain unanswered, putting the site’s relevance and reputation into question. Hence, it is important to understand users’ participation in privacy and security discussions to promote engagement and foster the exchange of such expertise. Objective: Based on prior findings on online social networks, this work elaborates on the interplay between users’ engagement and their privacy practices in SO. Particularly, it analyses developers’ self-disclosure behaviour regarding profile visibility and their involvement in discussions related to privacy and security. Method: We followed a mixed-methods approach by (i) analysing SO data from 1239 cybersecurity-tagged questions along with 7048 user profiles, and (ii) conducting an anonymous online survey (N=64). Results: About 33% of the questions we retrieved had no answer, whereas more than 50% had no accepted answer. We observed that “proactive” users tend to disclose significantly less information in their profiles than “reactive” and “unengaged” ones. However, no correlations were found between these engagement categories and privacy-related constructs such as Perceived Control or General Privacy Concerns. Implications: These findings contribute to (i) a better understanding of developers’ engagement towards privacy and security topics, and (ii) shaping strategies that promote the exchange of cybersecurity expertise in SO.
Conversational DevBots for Secure Programming: An Empirical Study on SKF Chatbot
Tony, Catherine, Balasubramanian, Mohana, Díaz Ferreyra, Nicolás E. and 1 more author
In Evaluation and Assessment in Software Engineering (EASE) 2022
Conversational agents or chatbots are widely investigated and used across different fields including healthcare, education, and marketing. Still, the development of chatbots for assisting secure coding practices is in its infancy. In this paper, we present the results of an empirical study on SKF chatbot, a software-development bot (DevBot) designed to answer queries about software security. To the best of our knowledge, SKF chatbot is one of the very few of its kind, thus a representative instance of conversational DevBots aiding secure software development. In this study, we collect and analyse empirical evidence on the effectiveness of SKF chatbot, while assessing the needs and expectations of its users (i.e., software developers). Furthermore, we explore the factors that may hinder the elaboration of more sophisticated conversational security DevBots and identify features for improving the efficiency of state-of-the-art solutions. All in all, our findings provide valuable insights pointing towards the design of more context-aware and personalized conversational DevBots for security engineering.
Preventative Nudges: Introducing Risk Cues for Supporting Online Self-Disclosure Decisions
Díaz Ferreyra, Nicolás E., Kroll, Tobias, Aïmeur, Esma and 2 more authors
Like in the real world, perceptions of risk can influence the behavior and decisions that people make in online platforms. Users of Social Network Sites (SNSs) like Facebook make continuous decisions about their privacy since these are spaces designed to share private information with large and diverse audiences. In particular, deciding whether or not to disclose such information will depend largely on each individual’s ability to assess the corresponding privacy risks. However, SNSs often lack awareness instruments that inform users about the consequences of unrestrained self-disclosure practices. Such an absence of risk information can lead to poor assessments and, consequently, undermine users’ privacy behavior. This work elaborates on the use of risk scenarios as a strategy for promoting safer privacy decisions in SNSs. In particular, we investigate, through an online survey, the effects of communicating those risks associated with online self-disclosure. Furthermore, we analyze the users’ perceived severity of privacy threats and its importance for the definition of personalized risk awareness mechanisms. Based on our findings, we introduce the design of preventative nudges as an approach for providing individual privacy support and guidance in SNSs.
Persuasion Meets AI: Ethical Considerations for the Design of Social Engineering Countermeasures
Díaz Ferreyra, Nicolás E., Aïmeur, Esma, Hage, Hicham and 2 more authors
In International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KMIS) 2020
Privacy in Social Network Sites (SNSs) like Facebook or Instagram is closely related to people’s self-disclosure decisions and their ability to foresee the consequences of sharing personal information with large and diverse audiences. Nonetheless, online privacy decisions are often based on spurious risk judgements that make people liable to reveal sensitive data to untrusted recipients and become victims of social engineering attacks. Artificial Intelligence (AI) in combination with persuasive mechanisms like nudging is a promising approach for promoting preventative privacy behaviour among the users of SNSs. Nevertheless, combining behavioural interventions with high levels of personalization can be a potential threat to people’s agency and autonomy even when applied to the design of social engineering countermeasures. This paper elaborates on the ethical challenges that nudging mechanisms can introduce to the development of AI-based countermeasures, particularly to those addressing unsafe self-disclosure practices in SNSs. Overall, it endorses the elaboration of personalized risk awareness solutions as i) an ethical approach to counteract social engineering, and ii) an effective means for promoting reflective privacy decisions.
Building Trustworthiness in Computer-Mediated Introduction: A Facet-Oriented Framework
Borchert, Angela, Díaz Ferreyra, Nicolás E., and Heisel, Maritta
In International Conference on Social Media and Society (SMSociety) 2020
Computer-Mediated Introduction (CMI) is the process by which users with compatible purposes interact with one another through social media platforms to meet afterwards in the physical world. CMI covers purposes, such as arranging joint car rides, lodging or dating (e.g. Uber, Airbnb and Tinder). In this context, trust plays a critical role since CMI may involve risks like data misuse, self-esteem damage, fraud or violence. By evaluating the trustworthiness of the information system, its service provider and the other end-user, users decide whether to start and to continue an interaction. Since trustworthiness cues of these three actors are mainly perceived through the graphical user interface of the CMI service, end-users’ trust building is mediated by the information system. Consequently, systems implementing CMI must not only address trustworthiness on a system level but also on a brand and interpersonal level. This work provides a conceptual framework for analyzing facets of trustworthiness that can influence trust in CMI. By addressing these facets in software features, CMI systems can (i) have an impact on their perceived trustworthiness, (ii) shape that of the service provider and (iii) support the mutual trustworthiness assessment of users.
Balancing Trust and Privacy in Computer-Mediated Introduction: Featuring Risk as a Determinant for Trustworthiness Requirements Elicitation
Borchert, Angela, Díaz Ferreyra, Nicolás E., and Heisel, Maritta
In Proceedings of the 15th International Conference on Availability, Reliability and Security (ARES) 2020
In requirements elicitation methods, it is not unusual that conflicts between software requirements or between software goals and requirements can be detected. It is efficient to deal with those conflicts before further costs are invested to implement a solution that includes insufficient software features. This work introduces risk as an extension of a method for eliciting trust-related software features for computer-mediated introduction (CMI) so that software engineers can i) decide on the implementation of conflicting requirements in the problem space and ii) additionally reduce risks that accompany CMI use. CMI describes social media platforms on which strangers with compatible interests get acquainted online and build trust relationships with each other for potential offline encounters (e.g., online dating and sharing economy). CMI involves security and safety risks such as data misuse, deceit or violence. In the engineering process, software goals and requirements for trust building often come along with the disclosure of personal data, which may result in conflicts with goals and requirements for privacy protection. In order to tackle i) conflicting requirements and goals and ii) CMI risks, our approach involves risk assessment of user concerns and requirements in order to rank goals by their importance for the application. Based on the prioritization, conflicting requirements can be managed. The findings are presented with explicit examples from the application field of online dating.
A Conceptual Method for Eliciting Trust-related Software Features for Computer-mediated Introduction
Borchert, Angela, Díaz Ferreyra, Nicolás E., and Heisel, Maritta
In Proceedings of the 15th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE) 2020
Computer-Mediated Introduction (CMI) describes the process in which individuals with compatible intentions get to know each other through social media platforms to eventually meet in the physical world (i.e. sharing economy and online dating). This process involves risks such as data misuse, self-esteem damage, fraud or violence. Therefore, it is important to assess the trustworthiness of other users before interacting with or meeting them. In order to support users in that process and, thereby, reduce the risks associated with CMI use, previous work has proposed developing CMI platforms whose software features address users’ trust concerns regarding other users. In line with that approach, we have developed a conceptual method for requirements engineers to systematically elicit trust-related software features for a safer, user-centred CMI. The method not only considers trust concerns, but also workarounds, trustworthiness facets and trustworthiness goals to derive requirements as a basis for appropriate trust-related software features. In this way, the method facilitates the development of application-specific software, which we illustrate with an example for the online dating app Plenty of Fish.
Manipulation and Malicious Personalization: Exploring the Self-Disclosure Biases Exploited by Deceptive Attackers on Social Media
Aïmeur, Esma, Díaz Ferreyra, Nicolás E., and Hage, Hicham
In the real world, the disclosure of private information to others often occurs after a trustworthy relationship has been established. Conversely, users of Social Network Sites (SNSs) like Facebook or Instagram often disclose large amounts of personal information prematurely to individuals who are not necessarily trustworthy. Such a low privacy-preserving behavior is often exploited by deceptive attackers with harmful intentions. Basically, deceivers approach their victims in online communities using incentives that motivate them to share their private information, and ultimately, their credentials. Since motivations, such as financial or social gain, vary from individual to individual, deceivers must wisely choose their incentive strategy to mislead the users. Consequently, attacks are tailored to each victim based on their particular information-sharing motivations. This work analyses, through an online survey, those motivations and cognitive biases which are frequently exploited by deceptive attackers in SNSs. We propose thereafter some countermeasures for each of these biases to provide personalized privacy protection against deceivers.
Learning from Online Regrets: From Deleted Posts to Risk Awareness in Social Network Sites
Díaz Ferreyra, Nicolás E., Meis, Rene, and Heisel, Maritta
In Adjunct Proceedings of the 27th ACM Conference On User Modelling, Adaptation And Personalization (UMAP) 2019
Social Network Sites (SNSs) like Facebook or Instagram are spaces where people expose their lives to wide and diverse audiences. This practice can lead to unwanted incidents such as reputation damage, job loss or harassment when pieces of private information reach unintended recipients. As a consequence, users often regret having posted private information on these platforms and proceed to delete such content after having a negative experience. Risk awareness is a strategy that can be used to persuade users towards safer privacy decisions. However, many risk awareness technologies for SNSs assume that information about risks is retrieved and measured by an expert in the field. Consequently, risk estimation is an activity that is often passed over despite its importance. In this work we introduce an approach that employs deleted posts as risk information vehicles to measure the frequency and consequence level of self-disclosure patterns in SNSs. In this method, consequence is reported by the users through an ordinal scale and used later on to compute a risk criticality index. We thereupon show how this index can serve in the design of adaptive privacy nudges for SNSs.
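The abstract above does not state how the risk criticality index is computed. As a rough illustration only, the sketch below assumes a classic "frequency times consequence" risk formulation: each deleted post is tagged with a self-disclosure pattern and a user-reported consequence level, and the index of a pattern is its share of all deleted posts multiplied by its normalised median consequence. The scale labels, the function name `risk_criticality`, and the formula itself are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import median

# Hypothetical five-point ordinal scale (an assumption, not from the paper).
CONSEQUENCE_SCALE = {
    "insignificant": 1,
    "minor": 2,
    "moderate": 3,
    "major": 4,
    "catastrophic": 5,
}


def risk_criticality(deleted_posts):
    """Return {pattern: index}, where index = frequency share * normalised median consequence.

    `deleted_posts` is a list of (pattern, consequence_label) pairs.
    The formula is an illustrative assumption, not the authors' definition.
    """
    levels_by_pattern = defaultdict(list)
    for pattern, consequence in deleted_posts:
        levels_by_pattern[pattern].append(CONSEQUENCE_SCALE[consequence])

    total = len(deleted_posts)
    max_level = max(CONSEQUENCE_SCALE.values())
    return {
        pattern: (len(levels) / total) * (median(levels) / max_level)
        for pattern, levels in levels_by_pattern.items()
    }


# Made-up reports, purely for illustration.
reports = [
    ("location", "major"),
    ("location", "moderate"),
    ("location", "major"),
    ("health", "catastrophic"),
]
index = risk_criticality(reports)
for pattern, value in sorted(index.items(), key=lambda item: -item[1]):
    print(f"{pattern}: {value:.2f}")
```

Under these assumptions, a frequent pattern with moderate consequences can outrank a rare pattern with severe ones; an adaptive nudge could then use the ranking to decide which disclosures to warn about first.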
Instructional Awareness: A User-Centred Approach for Risk Communication in Social Network Sites
Often, users of Social Network Sites (SNSs) like Facebook or Twitter find it hard to foresee the negative consequences of sharing private information on the Internet. Hence, many users suffer unwanted incidents such as identity theft, reputation damage, or harassment after their private information reaches an unintended audience. Many efforts have been made to develop preventative technologies (PTs) with the purpose of raising the levels of privacy awareness among the users of SNSs. Basically, these technologies generate interventions (i.e. warning messages) when users attempt to disclose private or sensitive information inside these platforms. However, users do not fully engage with PTs because they often perceive their interventions as too invasive or annoying. This happens because users have different privacy concerns and attitudes that should be considered when generating such interventions. In other words, some users are less concerned about their privacy than others and, consequently, are more willing to disclose private information without caring much about the consequences. Therefore, PTs should incorporate adaptivity principles into their design in order to successfully nudge the users towards better privacy practices. This thesis focuses on the development of an adaptive approach for generating privacy awareness in SNSs, and particularly on the elaboration of software artefacts for communicating those privacy risks that may occur when disclosing private information in SNSs. Overall, this covers two main aspects: knowledge extraction and knowledge application. Artefacts for knowledge extraction include the data structures and methods necessary to represent and elicit risky self-disclosure scenarios in SNSs. In this work, privacy heuristics (PHs) are introduced as an alternative for representing such scenarios and as fundamental instruments for the generation of adaptive privacy awareness.
Alongside, the artefacts corresponding to knowledge application comprise those methods and algorithms that leverage the information contained inside PHs to shape the corresponding interventions. This includes methods for estimating the risk impact of a self-disclosure act and mechanisms for regulating the content and frequency of warning messages. All of these artefacts collaborate with each other in a conceptual framework that this thesis calls Instructional Awareness.
At Your Own Risk: Shaping Privacy Heuristics for Online Self-disclosure
Díaz Ferreyra, Nicolás E., Meis, Rene, and Heisel, Maritta
In 16th Annual Conference on Privacy, Security and Trust (PST) 2018
Revealing private and sensitive information on Social Network Sites (SNSs) like Facebook is a common practice which sometimes results in unwanted incidents for the users. One approach for helping users to avoid regrettable scenarios is through awareness mechanisms which inform a priori about the potential privacy risks of a self-disclosure act. Privacy heuristics are instruments which describe recurrent regrettable scenarios and can support the generation of privacy awareness. One important component of a heuristic is the group of people who should not access specific private information under a certain privacy risk. However, specifying an exhaustive list of unwanted recipients for a given regrettable scenario can be a tedious task which necessarily demands the user’s intervention. In this paper, we introduce an approach based on decision trees to instantiate the audience component of privacy heuristics with minor intervention from the users. We introduce Disclosure-Acceptance Trees, a data structure representative of the audience component of a heuristic and describe a method for their generation out of user-centred privacy preferences.
Access-Control Prediction in Social Network Sites: Examining the Role of Homophily
Díaz Ferreyra, Nicolás E., Hecking, Tobias, Hoppe, H. Ulrich and 1 more author
In International Conference on Social Informatics (SocInfo) 2018
Often, users of Social Network Sites (SNSs) like Facebook or Twitter have issues when controlling the access to their content. Access-control predictive models are used to recommend access-control configurations which are aligned with the users’ individual privacy preferences. One basic strategy for the prediction of access-control configurations is to generate access-control lists out of the emerging communities inside the user’s ego-network. That is, in a community-based fashion. Homophily, which is the tendency of individuals to bond with others who hold similar characteristics, can influence the network structure of SNSs and bias the users’ privacy preferences. Consequently, it can also impact the quality of the configurations generated by access-control predictive models that follow a community-based approach. In this work, we use a simulation model to evaluate the effect of homophily when predicting access-control lists in SNSs. We generate networks with different levels of homophily and analyse thereby its impact on access-control recommendations.
Towards an ILP Approach for Learning Privacy Heuristics from Users’ Regrets
Díaz Ferreyra, Nicolás E., Meis, Rene, and Heisel, Maritta
In European Network Intelligence Conference (ENIC) 2017
Disclosing private information in Social Network Sites (SNSs) often results in unwanted incidents for the users (such as bad image, identity theft, or unjustified discrimination), along with a feeling of regret and repentance. Regrettable online self-disclosure experiences can be seen as sources of privacy heuristics (best practices) that can help shape better privacy awareness mechanisms. Considering deleted posts as an explicit manifestation of users’ regrets, we propose an Inductive Logic Programming (ILP) approach for learning privacy heuristics. In this paper we introduce the motivating scenario and the theoretical foundations of this approach, and we provide an initial assessment towards its implementation.
Should User-generated Content be a Matter of Privacy Awareness? A position paper
Díaz Ferreyra, Nicolás E., Meis, Rene, and Heisel, Maritta
In International Conference On Knowledge Management and Information Sharing (KMIS) Nov 2017
Social Network Sites (SNSs) like Facebook or Twitter have radically redefined the mechanisms for social interaction. One of the main aspects of these platforms is their information-sharing features, which allow user-generated content to reach wide and diverse audiences within a few seconds. Whereas the spectrum of shared content is large and varied, it can nevertheless include private and sensitive information. Such content of a sensitive nature can result in unwanted incidents for the users (such as reputation damage, job loss, or harassment) when reaching unintended audiences. In this paper, we analyse and discuss the privacy risks of information disclosure in SNSs from a user-centred perspective. We argue that this is a problem of lack of awareness which is grounded in an emotional detachment between the users and their digital data. In line with this, we discuss preventative technologies for raising awareness and approaches for building a stronger connection between the users and their private information. Likewise, we encourage the inclusion of awareness mechanisms for providing better insights on the privacy policies of SNSs.
Online Self-disclosure: From Users’ Regrets to Instructional Awareness
Díaz Ferreyra, Nicolás E., Meis, Rene, and Heisel, Maritta
In International Cross-Domain Conference for Machine Learning and Knowledge Extraction Sep 2017