Rock2012's Blog

July 15, 2010

Writing a Program Scanning Tool in Python

Filed under: Programming - rock2012 @ 10:04 am

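Sometimes we need to analyze a program file and extract basic information about it: the contents of its PE headers, whether it is packed, whether it behaves maliciously, and so on. Plenty of existing tools can do this. But what if we only want specific pieces of information and have a large number of files to analyze? Then it is time to write our own program scanner. With Python and its third-party packages, it is easy to build a scanner tailored to our needs. In this post I present a piece of code and walk through it step by step to show how such a tool helps with program analysis. The tools involved are Python, pefile, PEiD, and a command-line malware scanner.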

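First, install the development packages:
1. Install Python.
2. Download pefile and extract the archive into the python\lib\site-packages folder.
3. In a command prompt, use cd to enter the extracted pefile folder and run python setup.py install.
Below, I walk through the code to describe how the tool works.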

## Entropy calculation from Ero Carrera’s blog ############### 
def E(data):
        # data is a bytes object, e.g. the raw contents of one section
        if not data:
                return 0.0
        entropy = 0.0
        for x in range(256):
                p_x = data.count(x) / len(data)   # fraction of bytes equal to x
                if p_x > 0:
                        entropy -= p_x * math.log(p_x, 2)
        return entropy

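Generally speaking, the more disordered the data in a program, the higher its entropy, and the more likely it is that the file has been packed or obfuscated. The entropy value ranges from 0.0 to 8.0 (bits per byte). A more detailed introduction is available here: http://blog.dkbza.org/2007/05/scanning-data-for-entropy-anomalies.html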
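To get a feel for the 0.0 to 8.0 range, here is a minimal, self-contained sketch in Python 3. The function name shannon_entropy and the three sample buffers are illustrative and not part of the original script. Random bytes stand in for packed or encrypted data.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p_x = count / total
        entropy -= p_x * math.log2(p_x)
    return entropy

# Illustrative buffers (assumptions, not taken from a real executable).
constant = b"\x00" * 4096          # a single repeated byte value
uniform = bytes(range(256)) * 16   # every byte value equally often
random_like = os.urandom(4096)     # stand-in for packed or encrypted data

print("constant:    %.2f" % shannon_entropy(constant))
print("uniform:     %.2f" % shannon_entropy(uniform))
print("random-like: %.2f" % shannon_entropy(random_like))
```

A buffer with one repeated value sits at the bottom of the range and a buffer in which all 256 byte values are equally frequent sits at the top. Random data lands close to the top, which is why a section with entropy near 8.0 is worth a closer look.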

## Load PEID userdb.txt database and scan file 
def PEID():
        signatures = peutils.SignatureDatabase('userdb.txt')
        matches = signatures.match_all(pe, ep_only=True)
        print("PEID Signature Match(es):", matches)
        print()

The PEiD user database (userdb.txt) can be downloaded here: http://www.peid.info/BobSoft/Downloads.html
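For readers who have not opened userdb.txt: each entry is a bracketed name followed by a signature line of hex bytes, where ?? marks a byte that may take any value, and an ep_only flag. The sketch below shows, in simplified form, what matching such a signature against the bytes at the entry point involves. It is not how peutils is implemented; the two signatures are made up for illustration, the function names are hypothetical, and only whole-byte ?? wildcards are handled.

```python
# Two made-up entries in the userdb.txt layout (not real packer signatures).
SAMPLE_DB = """\
[Hypothetical Packer 1.0]
signature = 60 E8 ?? ?? ?? ?? 5D 81 ED
ep_only = true

[Another Made-Up Stub]
signature = 55 8B EC 83 ?? 10
ep_only = true
"""

def parse_signatures(text):
    """Return a list of (name, pattern); pattern holds ints, None = wildcard."""
    entries = []
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
        elif line.lower().startswith("signature") and name is not None:
            tokens = line.split("=", 1)[1].split()
            pattern = [None if t == "??" else int(t, 16) for t in tokens]
            entries.append((name, pattern))
    return entries

def match_at_entry_point(entry_bytes, entries):
    """Names of all signatures whose pattern matches the start of entry_bytes."""
    hits = []
    for name, pattern in entries:
        if len(entry_bytes) < len(pattern):
            continue
        if all(p is None or p == b for p, b in zip(pattern, entry_bytes)):
            hits.append(name)
    return hits

entries = parse_signatures(SAMPLE_DB)
code = bytes.fromhex("60E8000000005D81ED1234")  # pretend entry-point bytes
print(match_at_entry_point(code, entries))
```

With ep_only = True the comparison is anchored at the entry point, as in this sketch, rather than searched for throughout the file.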

## Print Sophos 
def sophos(filetmp):
        print("Sophos Scan in progress..")
        path = os.path.abspath(filetmp)
        pwd = os.getcwd()
        # The scanner shares our console, so its report is printed directly;
        # call() only returns the scanner's exit code.
        output = subprocess.call([os.path.join(pwd, 'cmd_scan', 'Sophos', 'SAV32CLI.EXE'), path])

Here the command-line scanner that ships with Sophos is used to analyze the program.
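subprocess.call only hands back the exit code, while the scanner's report goes straight to the console. If the report should be kept alongside the other results, the output can be captured instead. The following Python 3 sketch shows one way to do that. The helper name run_scanner is hypothetical, and a Python one-liner stands in for the real scanner so that the example runs anywhere; nothing here reflects the actual options or exit codes of SAV32CLI.EXE.

```python
import subprocess
import sys

def run_scanner(command, target):
    """Run command + [target]; return (exit_code, text printed by the scanner)."""
    result = subprocess.run(
        command + [target],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold error messages into the same report
        universal_newlines=True,    # decode the output to str
    )
    return result.returncode, result.stdout

# Stand-in "scanner": a Python one-liner that echoes the path it was given.
fake_scanner = [sys.executable, "-c", "import sys; print('scanned ' + sys.argv[1])"]
code, report = run_scanner(fake_scanner, "sample.exe")
print(code, report.strip())
```

To use it with a real scanner, the first argument would be a list holding the path to the scanner executable, built the same way as in the sophos function above.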

The rest is explained clearly enough by the code and its comments. The complete code follows:

## Virustotal Python Scanner script 0.01
## Created by Alexander Hanel

import sys
import os
import math
import time
import datetime
import subprocess
import pefile     # both of these modules are
import peutils    # included in the pefile package

##############################################################
## Print PE file attributes & metadata
def attributes():
        print("Image Base:", hex(pe.OPTIONAL_HEADER.ImageBase))
        print("Address Of Entry Point:", hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
        machine = pe.FILE_HEADER.Machine
        print("Required CPU type:", pefile.MACHINE_TYPE[machine])
        dll = pe.FILE_HEADER.IMAGE_FILE_DLL
        print("DLL:", dll)
        print("Subsystem:", pefile.SUBSYSTEM_TYPE[pe.OPTIONAL_HEADER.Subsystem])
        print("Compile Time:", datetime.datetime.fromtimestamp(pe.FILE_HEADER.TimeDateStamp))
        print("Number of RVA and Sizes:", pe.OPTIONAL_HEADER.NumberOfRvaAndSizes)

##############################################################
## Analyze Sections
def sections_analysis():
        print("Number of Sections:", pe.FILE_HEADER.NumberOfSections)
        print()
        print("Section  VirtualAddress VirtualSize SizeofRawData Entropy")
        for section in pe.sections:
                # Section names are bytes padded with NULs to eight characters
                name = section.Name.decode(errors="replace").rstrip("\x00")
                print("%-8s %-14s %-11s %-13s %.2f" % (name, hex(section.VirtualAddress), hex(section.Misc_VirtualSize),
                      section.SizeOfRawData, E(section.get_data())))
        print()

##############################################################
## Dump Imports
def IAT():
        print("Imported DLLS:")
        i = 1
        for entry in pe.DIRECTORY_ENTRY_IMPORT:
                print("%2s %-17s" % ([i], entry.dll.decode()))
                # imp.name is None for imports by ordinal; str() keeps the "None" output
                names = [imp.name.decode() if imp.name else str(imp.name) for imp in entry.imports]
                print("\t" + ", ".join(names))
                i += 1
               
##############################################################
## Entropy calculation from Ero Carrera’s blog ###############
def E(data):
        # data is a bytes object, e.g. the raw contents of one section
        if not data:
                return 0.0
        entropy = 0.0
        for x in range(256):
                p_x = data.count(x) / len(data)   # fraction of bytes equal to x
                if p_x > 0:
                        entropy -= p_x * math.log(p_x, 2)
        return entropy

##############################################################
## Load PEID userdb.txt database and scan file
def PEID():
        signatures = peutils.SignatureDatabase('userdb.txt')
        matches = signatures.match_all(pe, ep_only=True)
        print("PEID Signature Match(es):", matches)
        print()

##############################################################
## Print Sophos
def sophos(filetmp):
        print()
        print("Sophos Scan in progress..")
        path = os.path.abspath(filetmp)
        pwd = os.getcwd()
        # The scanner shares our console, so its report is printed directly;
        # call() only returns the scanner's exit code.
        output = subprocess.call([os.path.join(pwd, 'cmd_scan', 'Sophos', 'SAV32CLI.EXE'), path])
       
## Thanks habnabit
##############################################################

if len(sys.argv) < 2:
        print("Python Script <FILE>")
        sys.exit(3)
exename = sys.argv[1]
pe = pefile.PE(exename)   # module-level object used by all the functions above
print("\nPortable Executable Information")
attributes()
sections_analysis()
PEID()
IAT()
sophos(exename)
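The script handles one file per run, while the motivation at the top of the post was having many files to go through. One possible way to prepare a batch, sketched here in Python 3 using only the standard library, is to walk a folder and keep the files that begin with the DOS magic bytes MZ, which every PE file starts with. The function names looks_like_pe and find_candidates are hypothetical, and the demo creates its own throwaway files.

```python
import os
import tempfile

def looks_like_pe(path):
    """Cheap pre-filter: PE files start with the DOS magic bytes 'MZ'."""
    try:
        with open(path, "rb") as handle:
            return handle.read(2) == b"MZ"
    except OSError:
        return False

def find_candidates(root):
    """Return the sorted paths under root that pass the 'MZ' check."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if looks_like_pe(full):
                found.append(full)
    return sorted(found)

# Demo with throwaway files: one fake executable header and one text file.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "a.exe"), "wb") as f:
        f.write(b"MZ\x90\x00")
    with open(os.path.join(root, "notes.txt"), "wb") as f:
        f.write(b"hello")
    names = [os.path.basename(p) for p in find_candidates(root)]
    print(names)
```

Each path returned by find_candidates could then be passed to the scanner script as its <FILE> argument. The MZ check is only a pre-filter, so pefile may still reject a file that passes it.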


Source: http://hooked-on-mnemonics.blogspot.com/2010/04/creating-your-own-virustotal-well-kind.html
