Computational Collective Intelligence. It is well known that cyber criminal gangs are already using advanced and especially intelligent types of Android malware in order to overcome out-of-band security measures. This is done in order to broaden and enhance their attacks, which mainly target financial and credit institutions and their transactions. It is a fact that most applications used under the Android system are written in Java.
The research described herein proposes the development of an innovative active security system that goes beyond the limits of the existing ones. Its main task is the analysis and classification of the Java classes of each application. Such malware types are even involved in security systems in order to enter mobile devices and steal crucial data like the Two-Factor Authentication (2FA) codes sent by the banks.
In this way they offer cyber criminals the chance to access accounts despite the existence of an additional protection level. This type of malware is also characterized by a series of upgraded characteristics allowing the attackers to switch the device control between HTTP and SMS, regardless of the availability of an Internet connection. They can also be used for the development of portable botnets and for spying on their victims.
Demertzis and L. Iliadis
APK file. Such a typical application includes the AndroidManifest. On the other hand, ART creates and stores the interpreted ML during the installation of the application. ART uses the same encoding as Dalvik, by keeping the. APK files, but it replaces the. The classes are organized in the. The name of the file is identical to the name of the contained public class.
The ART loads the classes required to execute the Java program (class loader) and then it verifies the validity of the byte code files before execution (byte code verifier) [3]. The JCFA process also includes the analysis of the classes, methods and specific characteristics included in an application. It is really important that this is achieved by consuming minimum computational resources. Yerima et al. Also, in [11], PUMA (Permission Usage to detect Malware in Android) detects malicious Android applications through machine-learning techniques by analyzing the extracted permissions from the application itself.
Dini et al. On the other hand Dan Simon [13] employed the BBO algorithm on a real-world sensor selection problem for aircraft engine health estimation. Panchal et al. Ovreiu et al. In this work we employ 11 standard datasets to provide a comprehensive test bed for investigating the abilities of BBO in training MLPs.
Finally Mirjalili et al. [...] dependencies during the loading of classes, introduces intelligence at the compiler level. This fact enhances the defensive capabilities of the system significantly. It is important that the dependencies and the structural elements of an application are checked before its installation, so that malware cases can be stopped. An important innovative part of this research is related to the choice of the independent parameters, which was made after several exhaustive tests, in order to ensure the maximum performance and generalization of the algorithm and the consumption of minimum resources.
For example, it is the first time that such a system does not consider as independent parameters the permissions required by an application for its installation and execution, unlike all existing static or dynamic malware location analysis systems so far. Finally, it is worth mentioning that the BBO optimization algorithm, popular for engineering cases, is used for the first time to train an Artificial Neural Network (ANN) for a real information security problem.
The introduction of the files in the ART JVM always passes through the above level, where the check for malicious classes is done. If malicious classes are detected, decisions are made depending on the accuracy of the classification.
If the accuracy is high, then the decisions are made automatically; otherwise the user decides whether to accept or reject the application installation. If the classes are benign, the installation is performed normally and the user is notified that this is a secure application. The proposed architecture is presented in figure 2.
Fig. 2. The proposed architecture of the SAME
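The decision step described above can be sketched as follows. This is only an illustration in Python, not the authors' implementation: the `ClassVerdict` type, the `decide_installation` function, the three outcome labels and the 0.95 threshold are all assumptions introduced here, since the text does not state what accuracy counts as high.

```python
from dataclasses import dataclass

# Hypothetical threshold: the text does not say what counts as "high" accuracy.
AUTO_DECISION_THRESHOLD = 0.95


@dataclass
class ClassVerdict:
    """Classifier output for one Java class (hypothetical structure)."""
    class_name: str
    malicious: bool
    confidence: float  # classifier confidence in [0, 1]


def decide_installation(verdicts, threshold=AUTO_DECISION_THRESHOLD):
    """Return 'install', 'reject' or 'ask_user' for one application."""
    flagged = [v for v in verdicts if v.malicious]
    if not flagged:
        # All classes benign: install normally and notify the user.
        return "install"
    if any(v.confidence >= threshold for v in flagged):
        # At least one malicious class found with high confidence:
        # the decision is taken automatically.
        return "reject"
    # Malicious classes were reported, but with low confidence:
    # the user accepts or rejects the installation.
    return "ask_user"
```

Treating a single high-confidence malicious class as sufficient for automatic rejection is one possible policy; a stricter or more lenient rule would only change the `any(...)` condition.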
The relative security systems presented in 1. Of course the above process aims to spot the malware. This innovation enhances the active security of Android mobile phones significantly. It also creates new perspectives in the design architecture of operating systems, which adopt smart defense mechanisms against sophisticated attacks.
The process of feature extraction was done in the Python language, combined with the JSON technique for representing simple data structures as described in [20]. The full list of the 24 features, including the class (Benign or Malicious), is presented in table 1. Table 1. Thus, an effort has been made to [...] This was done in order for the new linear combinations to contain the biggest part of the variance of the initial information.
This method was not selected because the results were not as good as expected. The next step was the use of correlation analysis. Also we tested vectors based on the cost-sensitive classification by using the cost-matrix.
Additionally, vectors were tested by estimating the value of each feature based on the information gain for each class (Information Gain Attribute Evaluation). Finally, the optimal subset was selected based on a genetic algorithm which optimized the classification error for the training and testing data in correlation with the value of each feature [21]. The 12 features used in the final dataset are described in table 2. Table 2. 11, entropy: the Entropy value was estimated for each class as a degree of uncertainty, with the highest values recorded for the malicious classes.
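The two quantities mentioned above, entropy and information gain, can be illustrated with a short Python sketch. It assumes discrete (already binned) feature values and uses hypothetical function names; it is not the authors' feature evaluation code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting the labels on one feature.

    Assumes the feature is discrete; continuous features would need
    to be binned first.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

A feature that separates benign from malicious classes perfectly obtains a gain equal to the entropy of the labels, while a feature unrelated to the class obtains a gain of zero, which is why the gain can be used to rank candidate features.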
Generally, there are three heuristic approaches used to train or optimize MLPs. First, heuristic algorithms are used to find a combination of weights and biases which provide the minimum error for a MLP. Second, heuristic algorithms are employed to find the optimal architecture for the network related to a specific case. The last approach is to use an evolutionary algorithm to tune the parameters of a gradient-based learning algorithm, such as the learning rate and momentum [22].
In the first method, the architecture does not change during the learning process. The training algorithm is required to find proper values for all connection weights and biases in order to minimize the overall error of the MLP. The most important parts of MLPs are the connection weights and biases because these parameters define the final values of output.
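To make the first approach concrete, the sketch below shows how all weights and biases of a fixed-architecture MLP can be packed into one flat vector and scored by the mean squared error, which is the quantity a heuristic trainer would minimize. The single hidden layer, the sigmoid activation, the vector layout and all function names are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def unpack(vector, n_in, n_hidden, n_out):
    """Split a flat parameter vector into weights and biases.

    Assumed layout: hidden weights, hidden biases, output weights,
    output biases.
    """
    expected = n_hidden * (n_in + 1) + n_out * (n_hidden + 1)
    if len(vector) != expected:
        raise ValueError(f"expected {expected} parameters, got {len(vector)}")
    i = 0
    w1 = [vector[i + r * n_in: i + (r + 1) * n_in] for r in range(n_hidden)]
    i += n_hidden * n_in
    b1 = vector[i: i + n_hidden]
    i += n_hidden
    w2 = [vector[i + r * n_hidden: i + (r + 1) * n_hidden] for r in range(n_out)]
    i += n_out * n_hidden
    b2 = vector[i: i + n_out]
    return w1, b1, w2, b2


def forward(vector, x, n_in, n_hidden, n_out):
    """Compute the MLP outputs for one input sample."""
    w1, b1, w2, b2 = unpack(vector, n_in, n_hidden, n_out)
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w2, b2)]


def mse(vector, samples, n_in, n_hidden, n_out):
    """Mean squared error over (input, target) pairs: the fitness to minimize."""
    total = 0.0
    for x, target in samples:
        out = forward(vector, x, n_in, n_hidden, n_out)
        total += sum((o - t) ** 2 for o, t in zip(out, target)) / n_out
    return total / len(samples)
```

With this encoding, any population-based optimizer only needs to propose candidate vectors and compare their `mse` values; the architecture itself stays fixed.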
Training an MLP involves finding optimum values for weights and biases in order to achieve desirable outputs from certain given inputs. Each island is a solution to the problem, whereas the species are the parameters of each solution. The areas with high suitability values offer the best solutions. Each solution is improved by taking characteristics or species from other areas through the immigration mechanism. The algorithm is terminated either after a predefined number of iterations or after achieving a target.
An essential difference from the corresponding biologically inspired optimization algorithms is that in BBO there is no reproduction or offspring concept. No new solutions are produced from iteration to iteration; instead, the existing ones evolve. When there are changes or new solutions, the parameters are updated by exchanging characteristics through immigration [13].
Fig. 3. Architecture of the proposed MLP
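The migration mechanism described in this section can be sketched in Python as follows. The linear, rank-based immigration and emigration rates used here are a common textbook choice and are an assumption, as are the function name and the convention that a lower fitness value (error) is better; mutation and elitism are left out, so this is not the full algorithm used by the authors.

```python
import random


def migrate(habitats, fitness, rng=random):
    """One BBO migration pass over a population of habitats.

    habitats: list of equal-length parameter lists (one per solution).
    fitness:  list of error values, lower is better.
    Returns new habitats; the inputs are left unchanged.
    """
    n = len(habitats)
    order = sorted(range(n), key=lambda i: fitness[i])  # best habitat first
    emigration = [0.0] * n
    immigration = [0.0] * n
    for rank, i in enumerate(order):
        # Good habitats share their features often and accept few;
        # poor habitats do the opposite.
        emigration[i] = (n - rank) / (n + 1)
        immigration[i] = 1.0 - emigration[i]

    new_habitats = [list(h) for h in habitats]
    for i in range(n):
        for j in range(len(habitats[i])):
            if rng.random() < immigration[i]:
                # Choose the source habitat in proportion to its emigration rate.
                source = rng.choices(range(n), weights=emigration)[0]
                new_habitats[i][j] = habitats[source][j]
    return new_habitats
```

Note that no habitat is created or discarded: each existing solution is modified in place of being replaced by offspring, which matches the distinction from other evolutionary algorithms drawn above.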
The output vector of the MLP comprises the two potential classification values, benign or malicious. The architecture of the MLP is shown in figure 3. The proposed method starts by generating a random set of MLPs based on the defined number of habitats (BBO employs a number of search agents called habitats, which are analogous to chromosomes in GAs).
Afterwards, the MLPs are combined based on the emigration and immigration rates.

The term, coined by Jeff Howe [6], [...] The idea of crowdsourcing is to send a task to the crowd instead of running it using their own resources. WikiCrimes [5] enables people to share information about the occurrence of criminal acts in a particular location. Vivacqua and Borges [13] propose the usage of collective intelligence capabilities [...] for information about corruption in Brazil. [...] to gather information about various aspects relevant to society such as corruption, crime and violence. The increasing demand and use of collective intelligence applications indicates that people tend to collaborate in order to build a collective knowledge, since such knowledge is [...] Thus, [...]

[...] vehicles associated to those stops. Other available tools are based on the principle of visualizing information on maps, including: OnibusRecife which [...] cars in Toronto, Canada, with information provided, in real time, by the transportation companies; and Bus Maps London, which shows static information about bus stops, routes and schedules for the city of London. Most solutions also strongly [...] A feasible alternative to obtain real-time information is to create an intelligent transportation system [...]

[...] questionnaire answered by public transportation users in Brazil. The following steps [...]

During the planning activity, we defined the research hypotheses, which were used to formulate the questionnaire. The hypotheses are: [...]

The questions in the questionnaire were divided in two main groups. The first one included twelve questions (Table 1) [...] The [...] Most questions are multiple choice type and it was [...] responses. The results of this activity will be described in the next section.

Questionnaire items recovered from the tables:
6 When delays occur in the line you use, what kind of [...]
7 Suppose you are at the bus stop, waiting for the vehicle of a given line. If there is any delay, do you have a way to become aware of what happened, before the vehicle reaches the stop?
5 If there was an application to display real-time information about the bus lines you use, would you consult it?
[...] Internet?
The questionnaire was created using LimeSurvey and made available electronically. We conducted the survey [...] The questionnaire URL was published on social networks (Twitter, Facebook and Orkut) and was announced to several people in different cities. We have collected a total of [...] responses. During the data preparation, 13 responses were dismissed as they presented some [...]

To perform the data analysis activity, we considered the [...] valid responses, involving 94 cities in 21 different Brazilian states. Six states appeared in the results with a single [...] the state capital and Rio de Janeiro 66 responses, 57 from the state capital. The distribution of responses per state can be found in Table 3.

Table 3: Number of responses by state
State: Number of replies
Bahia: [...]
Distrito Federal: 10
Amazonas: 9

[...] The remaining respondents checked this option in conjunction with other(s). This may occur due to the fact that the information is not always available in one way or another. Out of these 32, only seven people live in capitals.
Other people said that the bus rarely or never arrives at the stop on time. In addition, people said they do not know if the bus is delayed because they are not aware of the time it was expected to arrive. This shows that the information provided on the timetables is not enough to make the passenger aware of the time the vehicle should arrive at the stop.
Figure 1: Frequency of public transportation usage
In addition, we intended to verify, when information is available, whether it is sufficient and complete.

We asked respondents how they become aware of the time [...] It was possible to check more than one choice among: (i) via Internet, (ii) by telephone, (iii) printed timetables or guides, (iv) asking other people, (v) there is no way to be aware, I go to the stop and wait for the vehicle, and (vi) other. The number [...]

Figure 2: Ways users search for arrival time information

We asked respondents what were the problems that often cause delays in vehicles (Figure 4). The answers showed [...] Flooding was reported [...] times and 29 other people pointed to other security incidents. [...] people reported that other, unlisted problems cause delays. Among the cited problems, 33 refer to mechanical problems, 24 to uncommitted drivers, 17 to transportation companies' planning failures and 11 to the lack of transportation vehicles.

Figure 4: Reasons of delays

As can be noted, some of the problems can be solved by administrative measures, such as planning issues and lack of vehicles. To find out if this information is available, we asked the respondents if they know ways to become aware of a delay, or of an event that causes a delay, before the vehicle reaches the stop (Figure 5).
It shows that the available information about public transportation is generally based on estimations and does not represent the current real-time information. Furthermore, it shows that users require more accurate information regarding public transportation.

Figure 5: Availability and importance of information about delayed vehicles

Figure 6: Types of information found when passengers use a line for the first time

We also asked if people seek information about the lines when they use them for the first time and, if they do so, what kind of information they are able to find (Figure 6). Among them, [...] said they can find the departure time from the line origin and [...] answered they can check the location of the intermediate [...] The other alternatives were the timetables for intermediate stops (64 answers), statistics of delays and other events (22 answers), and other information (33 answers). Among other information, the most cited was the line route or itinerary (18 answers). This number can be explained because people are not aware of the available resources to access this information. Therefore, it is not enough to make the information available; it is also necessary to make it easily accessible to the different categories of people who need it.

Based on the data discussed in this section, we can conclude that people generally seek information about public transportation. However, the information found is static, taking into account estimations provided by the companies. Therefore, this study confirms the hypothesis that public transportation users have very few ways to get information about the lines they use and the information obtained is incomplete and static.

The goal of the second hypothesis was to determine whether passengers make use of the knowledge shared by other people to obtain information about public transportation lines.

In this regard, we asked about the way they get information about bus timetables (Figure 2). [...] they obtain information via Internet. This shows that the Internet is the most used way to obtain information, but collective knowledge is also extensively used. By analyzing the people who checked both options, we come to a total of [...]

Another question that confirms the data presented above concerns the way users look for information when using a line for the first time (Figure 7). From the people that seek information, [...] said they obtain it through [...] To the same question, [...] people answered they use the Internet as a way to obtain information. Only 41 people seek the company services and other 30 reported other options. When analyzing the group of people that ask acquaintances or other people and [...] there are ways to become aware of delays before the vehicle arrives at a stop. When asked about the way this information is obtained, ten people said they check on terminals available at the stops or ask the transportation company, [...]

We also observed the tendency of using social networks to [...]
These facts confirm Hypothesis 2.

Figure 7: How users seek for information when using a line for the first time

To do so, respondents were asked if they participate in any social network (Figure [...]). [...] in public transportation (Figure 8). We asked if, currently, they post events related to the traffic on their social networks.

Figure [...]: Social Network usage

People were also asked about their trust in information about bus lines posted on the Internet (Figure [...]). From the remaining responses, [...] said they trust the information regardless of who posted it, without considering the information source.

Regarding the use of information shared via Internet, we asked passengers whether they would use an application that [...] And they would use such an application. The analysis of these data shows that, in general, people prefer to have information supplied by transportation companies.
According to the presented data, it is possible to say that most passengers get information by asking other people, i.e. [...] In addition, there is a tendency to share and search for information due to the high number of passengers that already use the Internet to do it. Confirming this tendency, many passengers said that they already report events that occur in public transportation or [...] This resource is interesting because it increases the power of collective intelligence.

The focus of Hypothesis 4 was to check whether applications for mobile devices can be used as a way to share information [...] To retrieve this information, we first asked passengers if they make use of Mobile Internet. [...] responses of people living in big cities. This indicates that [...]

Figure [...]: Amount of people that use mobile Internet

From the results of this research, we are able to elicit the requirements that address the needs of the UbiBus project end users. By using these applications, the citizens will be able to use the Internet, mobile devices and social networks to indicate, in real time, events that help to map the situation of the routes and the public transportation conditions. From the information provided by several passengers, the system will present, for example, line delays, vehicle problems, best or worst routes at a given moment, the best or worst lines and bus companies, more or less dangerous regions, and so on.