Day: August 26, 2019

How to build a convolution neural network based malware detector using malware

Deep Learning (DL), a subfield of machine learning, arose to help build algorithms that work like the human mind and are inspired by its structure. Information security professionals are also intrigued by such techniques, as they have provided promising results in defending against major cyber threats and attacks. One of the best-suited candidates for the implementation of DL is malware analysis.

This tutorial is an excerpt taken from the book, Mastering Machine Learning for Penetration Testing written by Chiheb Chebbi. In this book, you will learn to identify ambiguities, extensive techniques to breach an intelligent system, and much more.

In this post, we are going to explore artificial neural network architectures and learn how to use one of them to help malware analysts and information security professionals detect and classify malicious code. Before diving into the technical details and the steps for the practical implementation of the DL method, it is essential to learn about the other architectures of artificial neural networks.

Convolutional Neural Networks (CNNs)

Convolutional Neural Networks (CNNs) are a deep learning approach to the image classification problem, or what we call computer vision problems. Classic computer programs struggle to identify objects for many reasons, including lighting, viewpoint, deformation, and segmentation. This technique is inspired by how the eye works, especially the function of the visual cortex in animals. In a CNN, neurons are arranged in three-dimensional structures with width, height, and depth as characteristics. In the case of images, the height is the image height, the width is the image width, and the depth is the number of RGB channels.
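The width, height, and depth arrangement described above can be made concrete with plain NumPy arrays. This is only a small sketch: the 28 x 28 size and the variable names are assumptions chosen for illustration, not values taken from the book.

```python
import numpy as np

# Hypothetical image size, chosen only for illustration.
height, width = 28, 28

# Colour image: depth 3, one slice per RGB channel (channels-last layout).
rgb_image = np.zeros((height, width, 3), dtype=np.uint8)

# Grayscale image: a single channel, so the depth is 1.
gray_image = np.zeros((height, width, 1), dtype=np.uint8)

# Some frameworks expect the depth axis first; moving the axis changes
# only the memory layout convention, not the pixel values.
gray_channels_first = np.moveaxis(gray_image, -1, 0)

print(rgb_image.shape, gray_image.shape, gray_channels_first.shape)
```

Whichever layout is used, the `input_shape` given to the first convolutional layer has to follow the same axis order as the arrays fed to the network.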
To build a CNN, we need three main types of layer:

Convolutional layer: A convolutional operation extracts features from the input image by multiplying the values in the filter with the original pixel values
Pooling layer: The pooling operation reduces the dimensionality of each feature map
Fully-connected layer: The fully-connected layer is a classic multi-layer perceptron with a softmax activation function in the output layer

To implement a CNN with Python, you can use the following Python script:

    from keras import backend
    from keras.datasets import mnist  # dataset that the 28 x 28 input shape corresponds to
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Flatten

    # Channels-first layout, matching input_shape=(1, 28, 28) below.
    # Older Keras releases spelled this backend.set_image_dim_ordering('th').
    backend.set_image_data_format('channels_first')

    num_classes = 10  # MNIST has ten digit classes

    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Recurrent Neural Networks (RNNs)

Recurrent Neural Networks (RNNs) are artificial neural networks where we can make use of sequential information, such as sentences. In other words, RNNs perform the same task for every element of a sequence, with the output depending on the previous computations. RNNs are widely used in language modeling and text generation (machine translation, speech recognition, and many other applications). RNNs do not remember things for a long time.

Long Short Term Memory networks

Long Short Term Memory (LSTM) solves the short memory issue in recurrent neural networks by building a memory block. This block is sometimes called a memory cell.

Hopfield networks

Hopfield networks were developed by John Hopfield in 1982.
The main goal of Hopfield networks is auto-association and optimization. We have two categories of Hopfield network: discrete and continuous.

Boltzmann machine networks

Boltzmann machine networks use recurrent structures and only locally available information. They were developed by Geoffrey Hinton and Terry Sejnowski in 1985. The goal of a Boltzmann machine is also to optimize solutions.

Malware detection with CNNs

For this new model, we are going to discover how to build a malware classifier with CNNs. But I bet you are wondering how we can do that when CNNs take images as inputs. The answer is really simple: the trick here is converting malware into an image. Is this possible? Yes, it is. Malware visualization has been one of many active research topics over the past few years. One of the proposed solutions has come from a research study called Malware Images: Visualization and Automatic Classification by Lakshmanan Nataraj from the Vision Research Lab, University of California, Santa Barbara.

The following diagram details how to convert malware into an image:

The following is an image of the Alueron.gen!J malware:

This technique also gives us the ability to visualize malware sections in a detailed way:

With the problem of how to feed malware to CNN-based classifiers solved by representing it as images, information security professionals can use the power of CNNs to train models. One of the malware datasets most often used to feed CNNs is the Malimg dataset. This malware dataset contains 9,339 malware samples from 25 different malware families. You can download it from Kaggle (a platform for predictive modeling and analytics competitions) by visiting this link: https://www.kaggle.com/afagarap/malimg-dataset/data.
These are the malware families: Allaple.L, Allaple.A, Yuner.A, Lolyda.AA 1, Lolyda.AA 2, Lolyda.AA 3, C2Lop.P, C2Lop.gen!G, Instant access, Swizzor.gen!I, Swizzor.gen!E, VB.AT, Fakerean, Alueron.gen!J, Malex.gen!J, Lolyda.AT, Adialer.C, Wintrim.BX, Dialplatform.B, Dontovo.A, Obfuscator.AD, Agent.FYI, Autorun.K, Rbot!gen, and Skintrim.N.

After converting malware into grayscale images, you can get the following malware representation so you can use them later to feed the machine learning model:

The conversion of each malware sample to a grayscale image can be done using the following Python script:

    import os
    import array
    import numpy
    from PIL import Image

    filename = ''                     # path to the malware sample (left blank in the source)
    f = open(filename, 'rb')
    ln = os.path.getsize(filename)    # file length in bytes
    width = 256                       # fixed image width
    rem = ln % width
    a = array.array("B")              # unsigned 8-bit values, one per byte
    a.fromfile(f, ln - rem)           # drop the trailing partial row
    f.close()
    g = numpy.reshape(a, (len(a) // width, width))  # integer division, required on Python 3
    g = numpy.uint8(g)
    # The source called scipy.misc.imsave here, which newer SciPy releases no longer provide.
    Image.fromarray(g).save('.png')   # output file name is elided in the source

For feature selection, you can extract or use any image characteristics, such as the texture pattern, frequencies in the image, intensity, or color features, using different techniques such as Euclidean distance, or mean and standard deviation, to generate feature vectors later. In our case, we can use algorithms such as a color layout descriptor, homogeneous texture descriptor, or global image descriptors (GIST). Let’s suppose that we selected the GIST; pyleargist is a great Python library to compute it. To install it, use PIP as usual:

    # pip install pyleargist==1.0.1

As a use case, to compute a GIST, you can use the following Python script:

    from PIL import Image
    import leargist

    image = Image.open('.png')            # input file name is elided in the source
    New_im = image.resize((64, 64))
    des = leargist.color_gist(New_im)
    Feature_Vector = des[0:320]

Here, we keep only the first 320 values of the descriptor because we are using grayscale images. Don’t forget to save them as NumPy arrays to use them later to train the model. After getting the feature vectors, we can train many different models, including SVM, k-means, and artificial neural networks. One of the useful algorithms is the CNN.
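The byte-to-image idea above, together with the array shapes a small CNN typically expects, can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the book's pipeline: the helper names (`bytes_to_grayscale`, `downsample`, `one_hot`) are hypothetical, the input bytes are random stand-ins for real samples, and nearest-neighbour sampling stands in for a proper image resize.

```python
import numpy as np

def bytes_to_grayscale(data: bytes, width: int = 256) -> np.ndarray:
    """Reshape raw bytes into a 2-D uint8 array with `width` columns."""
    usable = len(data) - (len(data) % width)  # drop the trailing partial row
    if usable == 0:
        raise ValueError("input is shorter than one image row")
    pixels = np.frombuffer(data[:usable], dtype=np.uint8)
    return pixels.reshape(usable // width, width)

def downsample(image: np.ndarray, size: int = 32) -> np.ndarray:
    """Nearest-neighbour resize to size x size (a stand-in for a real resize)."""
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    return image[np.ix_(rows, cols)]

def one_hot(labels, num_classes: int) -> np.ndarray:
    """Turn integer class ids into one-hot rows."""
    return np.eye(num_classes, dtype=np.float32)[np.asarray(labels)]

# Synthetic stand-ins for three malware binaries of different sizes.
rng = np.random.default_rng(0)
samples = [rng.integers(0, 256, size=n, dtype=np.uint8).tobytes()
           for n in (70_000, 131_072, 300_001)]
labels = [0, 3, 24]  # hypothetical family ids in the range 0..24

images = np.stack([downsample(bytes_to_grayscale(s)) for s in samples])
train_X = images.reshape(-1, 32, 32, 1).astype("float32") / 255.0  # scale to [0, 1]
train_label = one_hot(labels, num_classes=25)

print(train_X.shape, train_label.shape)
```

Scaling pixel values to the 0 to 1 range is a common preprocessing choice rather than something the excerpt prescribes; with real data the same arrays would then be split into training, validation, and test sets.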
Once the feature selection and engineering is done, we can build a CNN. For our model, for example, we will build a convolutional network with two convolutional layers, with 32 * 32 inputs. To build the model using Python libraries, we can implement it with the previously installed TensorFlow and utils libraries. So the overall CNN architecture will be as in the following diagram:

This CNN architecture is not the only proposal to build the model, but at the moment we are going to use it for the implementation. To build the model and CNNs in general, I highly recommend Keras. The required imports are the following:

    import keras
    from keras.models import Sequential, Model
    from keras.layers import Input, Dense, Dropout, Flatten
    from keras.layers import Conv2D, MaxPooling2D
    from keras.layers import BatchNormalization, LeakyReLU

As we discussed before, the grayscale image has pixel values that range from 0 to 255, and we need to feed the net with 32 * 32 * 1 dimension images as a result:

    train_X = train_X.reshape(-1, 32, 32, 1)
    test_X = test_X.reshape(-1, 32, 32, 1)

We will train our network with these parameters:

    batch_size = 64
    epochs = 20
    num_classes = 25

To build the architecture, with regards to its format, use the following:

    Malware_Model = Sequential()
    Malware_Model.add(Conv2D(32, kernel_size=(3, 3), activation='linear', input_shape=(32, 32, 1), padding='same'))
    Malware_Model.add(LeakyReLU(alpha=0.1))
    Malware_Model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    Malware_Model.add(Conv2D(64, (3, 3), activation='linear', padding='same'))
    Malware_Model.add(LeakyReLU(alpha=0.1))
    # Flatten the feature maps before the dense layers; the source imports
    # Flatten but omits this call, which the one-hot labels require.
    Malware_Model.add(Flatten())
    Malware_Model.add(Dense(1024, activation='linear'))
    Malware_Model.add(LeakyReLU(alpha=0.1))
    Malware_Model.add(Dropout(0.4))
    Malware_Model.add(Dense(num_classes, activation='softmax'))

To compile the model, use the following:

    Malware_Model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

Fit and train the model:

    Malware_Model.fit(train_X, train_label, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(valid_X, valid_label))

As you noticed, we are respecting the flow of training a neural network that was discussed in previous chapters. To evaluate the model, use the following code:

    # evaluate() returns [loss, accuracy] for the metrics compiled above
    test_eval = Malware_Model.evaluate(test_X, test_Y_one_hot, verbose=0)
    print('The accuracy of the Test is:', test_eval[1])

Thus, in this post, we discovered how to build malware detectors using different machine learning algorithms, especially using the power of deep learning techniques. If you’ve enjoyed reading this post, do check out Mastering Machine Learning for Penetration Testing to find loopholes and surpass a self-learning security system.

The Ember project announces version 3.7 of Ember.js, Ember Data, and Ember

After releasing Ember 3.6 last month, the team behind the Ember project released version 3.7 of Ember.js, Ember Data, and Ember CLI last week. As always, Ember 3.7 marks the start of the 3.8 beta cycle for all the subprojects. This version drops support for Babel 6 and Node 4, along with a few bug fixes and performance improvements. There are no changes in the Ember Data subproject.

Updates in Ember.js 3.7

In Ember.js 3.7, support for Node 4 has been dropped explicitly, so if you want to upgrade to this version you need to upgrade your Node version first. Also, Node 6 support is planned to end in the next few months.

Updates in Ember CLI

The last usage of Babel 6 removed: The last usage of Babel 6 is removed in Ember CLI 3.7. Babel 6 was used to support compiling templates in addon/. It was also used to support addon-test-support/ in addons that do not have any .js processors. Since module compilation is compatible between Babel 6 and Babel 7, this update is not a breaking change.

Compatibility section in addon README: Whenever a new addon is generated using Ember CLI, a README file is also generated for the addon. This README will now come with a Compatibility section, which lets you easily communicate to users what the requirements are for using the addon.

You can upgrade Ember CLI using the following commands:

    npm install -g ember-cli-update
    ember-cli-update

Riot Games is struggling with sexism and lack of diversity; employees plan

Update as of 6 May, 2019: Riot Games announced early Friday that they will soon start giving new employees the option to opt out of some mandatory arbitration requirements when they are hired. The catch: the opt-out will initially be narrowly focused on a specific set of employees for a specific set of causes.

Riot Games employees are planning to walk out in protest of the company’s sexist culture and lack of diversity. Riot has been in the spotlight since Kotaku published a detailed report highlighting how five current and former Riot employees filed lawsuits against the company citing the sexist culture fostered at Riot. Out of the five employees, two were women. Per Kotaku, last Thursday, Riot filed a motion to force two of those women, whose lawsuits revolved around the California Equal Pay Act, into private arbitration. In their motions, Riot’s lawyer argues that these employees waived their rights to a jury trial when they signed arbitration agreements upon their hiring. Private arbitration makes these employees less likely to win against Riot.

In November last year, 20,000 Google employees along with temps, vendors, and contractors walked out to protest the discrimination, racism, and sexual harassment encountered at Google’s workplace. This walkout led to Google ending forced arbitration for its full-time employees. Google employees are also organizing a phone drive, announced in a letter published on Medium, to press lawmakers to legally end forced arbitration.
Per the Verge, “The employees are organizing a phone bank for May 1st and asking for people to make three calls to lawmakers, two to the caller’s senators and one to their representative, pushing for the FAIR Act, which was recently reintroduced in the House of Representatives.” Following Google, Facebook also made changes to its forced arbitration policy for sexual harassment claims.

Beyond sexual harassment, game developers also face unfair treatment in terms of work conditions, job instability, and inadequate pay. In February, the American Federation of Labor and Congress of Industrial Organizations (AFL-CIO) published an open letter on Kotaku. The letter urges video game industry workers to unionize and voice their support for better treatment within the workplace.

Following Riot’s motion, Riot employees have organized a walkout in protest, demanding that Riot leadership end forced arbitration against the two current employees. This walkout is planned for Monday, May 6. An internal document from Riot employees, as seen by Kotaku, describes the demands laid out by walkout organizers: a clear intention to end forced arbitration; a precise deadline (within 6 months) by which to end it; and a commitment to not force arbitration on the women involved in the ongoing litigation against Riot.

Riot’s sexist culture and lack of diversity

The investigation conducted by Kotaku last year unveiled some major flaws in Riot’s culture and in gaming companies in general. Over 28 current and former Riot employees spoke to Kotaku with stories of Riot’s female employees being treated unfairly and being on the receiving end of gender discrimination. An employee named Lucy told Kotaku that when she considered hiring a woman for a leadership role, she heard plenty of excuses for why her female job candidates weren’t Riot material.
Some were “ladder climbers.” Others had “too much ego.” Most weren’t “gamer enough.” A few were “too punchy,” or didn’t “challenge convention”, she told Kotaku. She also shared her personal experiences facing discrimination. Often her manager would imply that her position was a direct result of her appearance. Every few months, she said, a male boss of hers would comment in public meetings about how her kids and husband must really miss her while she was at work.

Women are often told they don’t fit in the company’s ‘bro culture’; eighty percent of Riot employees are men, according to data Riot collected from employees’ driver’s licenses. “The ‘bro culture’ there is so real,” said one female source to Kotaku, who said she’d left the company due to sexism. “It’s agonizingly real. It’s like working at a giant fraternity.”

Other people Kotaku interviewed told stories of women being groomed for promotions, and doing jobs above their title and pay grade, until men were suddenly brought in to replace them. Another woman told Kotaku “how a colleague once informed her, apparently as a compliment, that she was on a list getting passed around by senior leaders detailing who they’d sleep with.” Two former employees also added that they “felt pressure to leave after making their concerns about gender discrimination known.”

Many former Riot employees also refused to come forward to share their stories and refrained from participating in the walkout. For some, this was out of fear of retaliation from Riot’s fanbase; Riot is the creator of the popular game League of Legends. Others said that they were restricted from talking on the record because of non-disparagement agreements they signed before leaving the company.

The walkout threat spread far enough that it prompted a response from Riot’s chief diversity officer, Angela Roseboro, in the company’s private Slack over the weekend, reports Waypoint.
In a copy of the message obtained by Waypoint, Roseboro says, “We’re also aware there may be an upcoming walkout and recognize some Rioters are not feeling heard. We want to open up a dialogue on Monday and invite Rioters to join us for small group sessions where we can talk through your concerns, and provide as much context as we can about where we’ve landed and why. If you’re interested, please take a moment to add your name to this spreadsheet. We’re planning to keep these sessions smaller so we can have a more candid dialogue.”

Riot CEO Nicolo Laurent also acknowledged the talk of a walkout in a statement: “We’re proud of our colleagues for standing up for what they believe in. We always want Rioters to have the opportunity to be heard, so we’re sitting down today with Rioters to listen to their opinions and learn more about their perspectives on arbitration. We will also be discussing this topic during our biweekly all-company town hall on Thursday. Both are important forums for us to discuss our current policy and listen to Rioter feedback, which are both important parts of evaluating all of our procedures and policies, including those related to arbitration.”

Tech worker unions, Game Workers Unite, and Googlers for Ending Forced Arbitration have stood up in solidarity with Riot employees.

“Forced arbitration clauses are designed to silence workers and minimize the options available to people hurt by these large corporations”

“Employees at Riot Games are considering a walkout, and the organization efforts has prompted an internal response from company executives”, tweeted Coworker.org.

Others have also joined in support.

JupyterHub 1.0 releases with named servers, support for TLS encryption, and more

JupyterHub 1.0 was released last week as the first major update since 2015. JupyterHub allows multiple users to use Jupyter notebook. JupyterHub 1.0 comes with UI support for managing named servers, and TLS encryption and authentication support, among others.

What’s new in JupyterHub 1.0?

UI for named servers

JupyterHub 1.0 comes with full UI support for managing named servers. Named servers allow each JupyterHub user to have access to more than one named server. JupyterHub 1.0 introduces a new UI for managing these servers. Users can now create, start, stop, and delete their servers from the hub home page.

Source: Jupyter blog

TLS encryption and authentication

JupyterHub 1.0 supports TLS encryption and authentication of all internal communication. Spawners must implement the .move_certs method to make certificates available to the notebook server if it is not local to the Hub. Currently, local spawners and DockerSpawner support internal SSL.

Checking and refreshing authentication

JupyterHub 1.0 introduces three new configurations to refresh or expire authentication information. c.Authenticator.auth_refresh_age allows authentication to expire after a number of seconds. c.Authenticator.refresh_pre_spawn forces a refresh of authentication prior to spawning a server, effectively requiring a user to have up-to-date authentication when they start their server. Authenticator.refresh_auth defines what it means to refresh authentication and can be customized by Authenticator implementations.

Other changes

A new API is added in JupyterHub 1.0 for registering user activity. Activity is now tracked by pushing it to the Hub from user servers instead of polling the proxy API. Dynamic options_form callables may now return an empty string, which will result in no options form being rendered. Spawner.user_options is persisted to the database to be re-used, so that a server spawned once via the form can be re-spawned via the API with the same options.
The c.PAMAuthenticator.pam_normalize_username option is added for round-tripping usernames through PAM to retrieve the normalized form. The c.JupyterHub.named_server_limit_per_user configuration is added to limit the number of named servers each user can have. The default is 0, for no limit. API requests to HubAuthenticated services (e.g. single-user servers) may pass a token in the Authorization header, matching authentication with the Hub API itself. The Authenticator.is_admin(handler, authentication) method and the Authenticator.admin_groups configuration are added for automatically determining that a member of a group should be considered an admin.

These are just a select few updates. For the full list of new features and improvements in JupyterHub 1.0, visit the changelog.

You can upgrade jupyterhub with conda or pip:

    conda install -c conda-forge jupyterhub==1.0.*
    pip install --upgrade jupyterhub==1.0.*

Users were quite excited about the release. Here are some comments from a Hacker News thread.

“This is really cool and I’m impressed by the jupyter team. My favorite part is that it’s such a good product that beats the commercial products because it’s hard to figure out, I think, commercial models that support this wide range of collaborators (people who view once a month to people who author every day).”

“Congratulations! JupyterHub is a great project with high-quality code and docs. Looking forward to trying the named servers feature as I run a JupyterHub instance that spawns servers inside containers based on a single image which inevitably tends to grow as I add libraries. Being able to manage multiple servers should allow me to split the image into smaller specialized images.”
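The configuration options named in this post live in a jupyterhub_config.py file. The fragment below is a rough sketch of how they might be combined, not a recommended setup: every value is an illustrative assumption, and c.JupyterHub.allow_named_servers and c.JupyterHub.internal_ssl are not mentioned in the post; they are, to my understanding, the settings that switch on named servers and internal TLS.

```python
# jupyterhub_config.py (sketch; all values are illustrative assumptions)
c = get_config()  # noqa: F821  (provided by JupyterHub when it loads this file)

# Named servers: enable them and cap how many each user may create.
# allow_named_servers is an assumption, it is not named in the post.
c.JupyterHub.allow_named_servers = True
c.JupyterHub.named_server_limit_per_user = 3  # 0, the default, means no limit

# Authentication refresh: consider auth info stale after 10 minutes and
# force a refresh before any server is spawned.
c.Authenticator.auth_refresh_age = 600  # seconds
c.Authenticator.refresh_pre_spawn = True

# Internal TLS between the Hub, proxy, and user servers.
# internal_ssl is an assumption, it is not named in the post.
c.JupyterHub.internal_ssl = True
```

This file only makes sense when loaded by JupyterHub itself (for example with `jupyterhub -f jupyterhub_config.py`), since `get_config()` is injected at load time. Spawners that run servers on other machines would additionally need the `.move_certs` support described above.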