Letter to submit work to
20th Annual (2023) "Humies" Awards For Human-Competitive Results - Produced by Genetic and Evolutionary Computation
To be held as part of: Genetic and Evolutionary Computation Conference (GECCO)
July 15-19, 2023 (Saturday - Wednesday)
Lisbon, Portugal
-------------------------------------------------------------------------------------------------------------------
1. the complete title of one (or more) paper(s) published in the open literature describing the work that the author claims describes a human-competitive result;
a. Paper a: Evolving malware variants as antigens for antivirus systems
b. Paper b: Adapting novelty towards generating antigens for antivirus systems
-------------------------------------------------------------------------------------------------------------------
2. the name, complete physical mailing address, e-mail address, and phone number of EACH author of EACH paper(s);
a. Paper a: Evolving malware variants as antigens for antivirus systems
Author 1: Ritwik Murali
Address: Dept. of Computer Science & Engineering, Amrita School of Computing, Coimbatore, Amrita Vishwa Vidyapeetham University, Coimbatore, Tamil Nadu, India - 641112
email: m_ritwik@cb.amrita.edu
phone: +91 (422) 2685 000
Author 2: Palanisamy Thangavel
Address: Dept. of Mathematics, Amrita School of Physical Sciences, Coimbatore, Amrita Vishwa Vidyapeetham University, Coimbatore, Tamil Nadu, India - 641112
email: t_palanisamy@cb.amrita.edu
phone: +91 (422) 2685 000
Author 3: C Shunmuga Velayutham
Address: Dept. of Computer Science & Engineering, Amrita School of Computing, Coimbatore, Amrita Vishwa Vidyapeetham University, Coimbatore, Tamil Nadu, India - 641112
email: cs_velayutham@cb.amrita.edu
phone: +91 (422) 2685 000
b. Paper b: Adapting novelty towards generating antigens for antivirus systems
Author 1: Ritwik Murali
Address: Dept.
of Computer Science & Engineering, Amrita School of Computing, Coimbatore, Amrita Vishwa Vidyapeetham University, Coimbatore, Tamil Nadu, India - 641112
email: m_ritwik@cb.amrita.edu
phone: +91 (422) 2685 000
Author 2: C Shunmuga Velayutham
Address: Dept. of Computer Science & Engineering, Amrita School of Computing, Coimbatore, Amrita Vishwa Vidyapeetham University, Coimbatore, Tamil Nadu, India - 641112
email: cs_velayutham@cb.amrita.edu
phone: +91 (422) 2685 000
-------------------------------------------------------------------------------------------------------------------
3. the name of the corresponding author (i.e., the author to whom notices will be sent concerning the competition);
Corresponding Author: Ritwik Murali
-------------------------------------------------------------------------------------------------------------------
4. the abstract of the paper(s);
a. Paper a: Evolving malware variants as antigens for antivirus systems
Abstract: This paper proposes MAGE, a Malware Antigen Generating Evolutionary algorithm that is capable of generating unseen variants of a given source malware. MAGE evolves malware variants by employing code transformation functions as mutation operators and an intra-population Jaccard similarity metric as the fitness function. By virtue of these design choices, MAGE is capable of generating active malware variants with diverse code structure variations while retaining the maliciousness of the source malware. These malware variants (similar to biological antigens) generated throughout the run of MAGE form a potential dataset of malware variants. The dataset can be used to train an adaptive Antivirus engine to learn the code structure variations that make up the space of malware variants. This could augment the engine's ability to detect unseen malware variants, thus preventing attacks from the same. The efficacy of MAGE has been demonstrated with two malware, viz. Timid, a COM infector, and Intruder, an EXE infector.
The simulation experiments demonstrate the potential and versatility of MAGE towards generating diverse malware variants.
b. Paper b: Adapting novelty towards generating antigens for antivirus systems
Abstract: It is well known that anti-malware scanners depend on malware signatures to identify malware. However, even minor modifications to malware code structure result in a change in the malware signature, thus enabling the variant to evade detection by scanners. Therefore, there exists the need for a proactively generated malware variant dataset to aid detection of such diverse variants by automated antivirus scanners. This paper proposes and demonstrates a generic assembly source code based framework that facilitates any evolutionary algorithm to generate diverse and potential variants of an input malware, while retaining its maliciousness, yet capable of evading antivirus scanners. Generic code transformation functions and a novelty search supported quality metric have been proposed as components of the framework to be used respectively as variation operators and fitness function, for evolutionary algorithms. The results demonstrate the effectiveness of the framework in generating diverse variants and the generated variants have been shown to evade over 98% of popular antivirus scanners. The malware variants evolved by the framework can serve as antigens to assist malware analysis engines to improve their malware detection algorithms.
-------------------------------------------------------------------------------------------------------------------
5. a list containing one or more of the eight letters (A, B, C, D, E, F, G, or H) that correspond to the criteria (see above) that the author claims that the work satisfies;
a. Paper a: Evolving malware variants as antigens for antivirus systems
B, D, E, F, G
b.
Paper b: Adapting novelty towards generating antigens for antivirus systems
D, E, F, G
-------------------------------------------------------------------------------------------------------------------
6. a statement stating why the result satisfies the criteria that the contestant claims (see examples of statements of human-competitiveness as a guide to aid in constructing this part of the submission);
Malware are programs designed by malicious actors to damage computing systems by extracting personally identifiable information, holding sensitive data to ransom, or even taking complete control of a computing device without the knowledge of the end user. A typical antivirus scanner scans files in the end user's computing environment for signature patterns that identify a malicious entity. However, it is well known that even minor modifications to the code structure allow malware variants to evade detection by these scanners, and the majority of successful malware attacks are variants of existing malware that escape detection. The research being submitted employs evolutionary algorithms to generate a dataset of diverse malware variants that can help antivirus scanners identify potentially undiscovered variants.
a. Paper a: Evolving malware variants as antigens for antivirus systems
This paper proposes a malware antigen generating evolutionary algorithm (MAGE) that is capable of generating variants of a given malware as “antigens” for antivirus systems. MAGE is capable of evolving diverse, non-trivial assembly code structure variations for a given source malware without changing or affecting the original malicious behaviour. A pivot-based single-point crossover and code-transformation-based mutation functions, along with an intra-population Jaccard similarity based fitness function, enable MAGE to evolve a valid and trusted dataset of malware variants.
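As a rough illustration of what an intra-population Jaccard similarity based fitness can look like, the sketch below represents each individual as a set of instruction n-grams and scores it by its mean Jaccard distance to the rest of the population. The n-gram representation, the function names (`ngrams`, `jaccard`, `diversity_fitness`), and the choice of mean distance are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import List, Set, Tuple


def ngrams(instructions: List[str], n: int = 2) -> Set[Tuple[str, ...]]:
    """Return the set of n-grams (as tuples) over a token sequence."""
    return {tuple(instructions[i:i + n]) for i in range(len(instructions) - n + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def diversity_fitness(population: List[List[str]], n: int = 2) -> List[float]:
    """Mean Jaccard distance (1 - similarity) of each individual to all others.

    Hypothetical fitness: higher means more structurally different
    from the rest of the population.
    """
    feats = [ngrams(ind, n) for ind in population]
    scores = []
    for i, fi in enumerate(feats):
        dists = [1.0 - jaccard(fi, fj) for j, fj in enumerate(feats) if j != i]
        scores.append(sum(dists) / len(dists) if dists else 0.0)
    return scores


# Toy population of placeholder token sequences (not real programs).
population = [["a", "b", "c"], ["a", "b", "d"], ["x", "y", "z"]]
scores = diversity_fitness(population)
```

In this toy population the third individual shares no n-grams with the other two, so it receives the highest score. A set-based measure like this ignores how often an n-gram occurs, which keeps the comparison cheap but coarse.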
MAGE operates directly on assembly code and guarantees valid executables. Though attempted on 32-bit Windows executables, MAGE is also extensible to 64-bit Windows executables by updating the language set. Thus the proposed work is also well suited to contemporary data-driven learning in anti-malware software.
Justification of Criteria B, D, F, G: The work first implemented the existing approach to generating malware variants (Cani et al., 2019) and compared against it. It empirically demonstrated that the EA-based approach to variant generation not only evolved complex code structures, but also produced variants that were compilable into valid executables and retained their original malicious characteristics. Thus the published work presents results that are not just publishable in their own right, but also better than existing approaches to solving long-standing problems in the field of malware defense.
Justification of Criteria E: Creating variations of code is a popular strategy used in high-level programming languages by most malware authors. The major challenge was adapting this strategy so that it can be applied to assembly-level programs. The EA-based approach proposed in this paper also allowed for automatically generating valid executables that modify the code structure without changing the functionality of the malware.
b. Paper b: Adapting novelty towards generating antigens for antivirus systems
This work discussed a generic assembly source code based framework that facilitates an evolutionary algorithm to generate diverse and potential variants of an input malware, while retaining its maliciousness, yet capable of evading antivirus scanners. Generic code transformation functions were proposed, and five transformation function instances based on them were defined as mutation operators.
The code block interchange transformation function was utilized in the design of the crossover operator to enable seamless recombination resulting in a valid executable. The proposed framework, which utilized a novelty search supported intra-population-based fitness function, was able to evolve both valid and diverse variant executables of a source malware.
Justification of Criteria D, E, F, G: The work explored novelty search as a method of generating diverse code structures. The approach allowed for the automatic generation of over 600 valid variants of a single malware. Producing such a set by hand would be extremely complex for a human. The results also showed that the generated malware were not only able to evade detection by popular antivirus scanners, but also forced some of the scanners to misclassify the malware variants as a completely different malware.
-------------------------------------------------------------------------------------------------------------------
7. a full citation of the paper (that is, author names; title, publication date; name of journal, conference, or book in which article appeared; name of editors, if applicable, of the journal or edited book; publisher name; publisher city; page numbers, if applicable);
a. Paper a: Evolving malware variants as antigens for antivirus systems
@article{murali2023evolving,
  title={Evolving malware variants as antigens for antivirus systems},
  author={Murali, Ritwik and Thangavel, Palanisamy and Velayutham, C Shunmuga},
  journal={Expert Systems with Applications},
  volume={226},
  pages={120092},
  year={2023},
  publisher={Elsevier}
}
b.
Paper b: Adapting novelty towards generating antigens for antivirus systems
@inproceedings{murali2022adapting,
  title={Adapting novelty towards generating antigens for antivirus systems},
  author={Murali, Ritwik and Velayutham, C Shunmuga},
  booktitle={Proceedings of the Genetic and Evolutionary Computation Conference},
  pages={1254--1262},
  year={2022}
}
-------------------------------------------------------------------------------------------------------------------
8. "Any prize money, if any, is to be divided equally among the co-authors" OR a specific percentage breakdown as to how the prize money, if any, is to be divided among the co-authors;
Any prize money is to be divided equally among the co-authors
-------------------------------------------------------------------------------------------------------------------
9. a statement stating why the authors expect that their entry would be the "best";
There have been many attempts to create variants of computer malware. In most cases, the attempts either work with high-level programming languages or evolve variants that are not always compilable into an executable. What sets this result apart is that it lays the foundation for automated, diverse malware generation for proactive defense. This work automatically evolves structurally diverse yet behaviorally intact malware variants. It also suggests the feasibility of evolving malware with behavioral diversity. The evolutionary algorithm evolved complex code structures that are well beyond human capacity and produced divergent variants that were misclassified by AV scanners, with a few of the scanners wrongly identifying the virus family itself. By proactively generating variants through code transformations that result in valid executables, the work provides instances from which a typical learning algorithm can create signatures.
These signatures could then be added to the antivirus database to trigger the immune response if any of these viruses are discovered in a client system. Thus, these malware variants (similar to biological antigens) have the potential to help adaptive AV engines achieve active acquired immunity against unseen variants of a given malware, thereby showcasing the potential of automated evolution.
-------------------------------------------------------------------------------------------------------------------
10. An indication of the general type of genetic or evolutionary computation used, such as GA (genetic algorithms), GP (genetic programming), ES (evolution strategies), EP (evolutionary programming), LCS (learning classifier systems), GI (genetic improvement), GE (grammatical evolution), GEP (gene expression programming), DE (differential evolution), etc.
GP
-------------------------------------------------------------------------------------------------------------------
11. The date of publication of each paper. If the date of publication is not on or before the deadline for submission, but instead, the paper has been unconditionally accepted for publication and is “in press” by the deadline for this competition, the entry must include a copy of the documentation establishing that the paper meets the "in press" requirement.
a. Paper a: Evolving malware variants as antigens for antivirus systems
Available online 16 April 2023
b. Paper b: Adapting novelty towards generating antigens for antivirus systems
Published: 08 July 2022
-------------------------------------------------------------------------------------------------------------------