Saturday, December 27, 2014

Insert CLOB data into MongoDB using Python

I am migrating data from Oracle to MongoDB using Python. During the migration I can read the CLOB object using clob.read(), but inserting it into MongoDB throws the following exception:
<class 'bson.errors.InvalidStringData'>
Traceback (most recent call last):
  File "test.py", line 39, in <module>
    db.test234.insert(i)
  File "C:\Python27\lib\site-packages\pymongo\collection.py", line 409, in insert
    gen(), check_keys, self.uuid_subtype, client)
InvalidStringData: strings in documents must be valid UTF-8: 'Malicious Attack Driver\r\n                            -----------------------\r\n\r\nThis is an effort to (Malicious attack
 driver) comprising of wrapper routines to provide test script infrastructure to  run different attack tools,vulnerability scanners,hacker tools such as . The objective
is to provide common APIs across all the protocols which can run the attack/test from a remote.

The Oracle column type is:
('REVIEW_DESCRIPTION', <type 'cx_Oracle.CLOB'>, -1, 4000, 0, 0, 0)

The code snippet is as follows:

import sys
import traceback

import cx_Oracle
from pymongo import MongoClient

mongoclient = MongoClient('localhost', 27017)
db = mongoclient['XYZ']

oracleConnection = cx_Oracle.connect('xyz/xyz1@dtabase')
oracleCursor = oracleConnection.cursor()
oracleCursor.execute("select review_description from table where id = 49390")

def getRows():
    """ returns cx_Oracle rows as dicts """
    # The column name is the first field of each cursor.description entry
    colnames = []
    for i in oracleCursor.description:
        print i
        colnames.append(i[0])
        print colnames

    for row in oracleCursor:
        rows = []
        for i in row:
            try:
                # LOB values expose read(), which returns the CLOB contents
                rows.append(i.read())
            except AttributeError:
                # Non-LOB values have no read() and are stored as-is
                rows.append(i)
        yield dict(zip(colnames, rows))

data = getRows()
for i in data:
    try:
        db.test234.insert(i)
    except Exception:
        # Print the exception class and the full traceback, then stop
        print sys.exc_info()[0]
        traceback.print_exc()
        quit()

I have checked many forums but could not find an exact solution for this issue. I tried options such as encoding the CLOB data, which still threw the same exception.

Can anyone help me resolve this issue?



For what it's worth, I just tried to insert that string using PyMongo, both with and without the extensions and both as a str and a unicode, and no error occurred. There must be something about the data we're not seeing from just the repr in the exception.
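One way to see what the repr hides is to scan the raw bytes returned by clob.read() for positions that are not valid UTF-8. The following is only a minimal sketch, not the responder's method: find_invalid_utf8 is a hypothetical helper, and the sample byte string with its \xfd bytes is an assumption about what a stray non-ASCII byte in the CLOB might look like. It is written for byte strings as Python 3 handles them.

```python
def find_invalid_utf8(data, context=10):
    """Return (offset, bad_byte, surrounding_bytes) for each undecodable position."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly, nothing more to report
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so shift it by pos
            bad = pos + err.start
            window = data[max(0, bad - context):bad + context]
            problems.append((bad, data[bad:bad + 1], window))
            pos = bad + 1  # resume scanning just past the offending byte
    return problems


# Hypothetical sample: two bytes that are not valid UTF-8 around the word "root"
sample = b"-user  : User name, defaults to \xfdroot\xfd"

for offset, bad_byte, around in find_invalid_utf8(sample):
    print(offset, repr(bad_byte), repr(around))
```

Running this kind of check on the actual value fetched from Oracle should show which bytes the BSON encoder rejects and where they sit in the text.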



Thank you very much for the reply. Here is the CLOB data that I tried to
upload to MongoDB. It is just a string, which shows all the
escape characters when displayed on the console.


"Malicious Attack Driver
                            -----------------------

This is an effort to develop a Tcl package (Malicious attack driver) 
comprising of wrapper routines to provide test script infrastructure to  
run different attack tools,vulnerability scanners,hacker tools such as 
Codenomicon,Nessus etc.. on a. The objective is to provide common 
APIs across all the protocols which can run the attack/test from a 
remote machine connected to the device, and check the health of the  
device after the attack by verifying console responsivesness, multiple 
ping sessions in a loop and comparing the process CPU/Interrupt 
CPU and memory utilization before and after the test. package for 
Codenomicon, a robustness testing tool.  In the future, support will 
added to cover other vulnerability testing tools like Nessus. Initially 
Codenomicon attack pack will be targeted and down the line other 
tools will be added.



Overview of  Codenomicon
-------------------------

Codenomicon is a tool that can be used to test security flaws in the 
protocols
Codenomicon provides automated tools with a systematic approach to 
test the      . The java based tool can simulate numerous protocol 
messages containing exceptional elements simulating malicious 
attacks with various protocols such as  such as TCP, BGP, TLS, 
Radius, Http, Ipv4, Ipv6, UDP, NTP, SSH,GRE,SIP,TACACS etc. 
It has both a GUI and a command line interface. 


Requirements
------------

Attack machine ----------|----------   
     |
     |
     
     |
     |
  ATS machine

The package routines need to connect to the attack machine and 
launch the attacks to the device. This involves control library routines. 
The attack machine requires java and does not work with all jdk 
versions.




                                APIs in the package
                                -------------------

1. mad::init_params  API
========================
This initializing API should be called for to setup test environment 
such as attack type, attack machine, login/password and other 
optional parameters for the whole bunch of tests following it.

Mandatory Parameters:
--------------------
-attack_type    : Type of attack (codenomicon)
-attack machine name     : Name of the machine where codenomicon 
exists
-passwd  : Password to reach attack machine
-javapath                : Path of the java executable.


Optional parameters:
--------------------
-user  : User name, defaults to ýrootý
prompt                  : Shell prompt of the attack machine

Returns:
-------
1= success
0= failure 


2. mad::run_test  API
======================
This API is called to run tests from the protocol suite. This API opens 
Ssh connection to the attack machine and runs a single test or a range
of tests specified. Since jar file options varies widely, "-params" is 
introduced which allows users to specify the parameters. This API will 
determine the initial memory and CPU utilization before starting the 
tests.


Mandatory Parameters:
---------------------
-dutname                : Name of the DUT
-params : parameters supplied to jar file containing the tests

Returns:
--------
0= failure (incorrect parameters in ýparams, unable to connect to 
attack machine)
1= success

The output of the test run can be accessed by mad::testrun_buffer


3. mad::check_health  API
=========================
This API checks the state of the    device after bombarding it with 
attacks. The following is the flow:

1. Check the responsiveness of  device by running ýshow versioný
2. Pings the device from the attack machine ýpingsý times
3. Checks CPU utilization and memory leaks by running show 
process cpu |  include CPU util and show mem | include Processor  
and compare with that run before the attack

Mandatory Parameters:
---------------------
-dutname  : Name of the    DUT
-target_ip : IP address of the interface on DUT
-mem_threshold : % Max allowed increase in memory
-cpu_threshold : % Max allowed increase in CPU utilization
-ping_params            : arguments passed to ping 

Optional Parameters:
--------------------
-pings : Max number of times the DUT is pinged, defaulted to 3

Returns:
--------
0=failure  (Either one of the steps failed)
1=success


4. mad::get_tests  API
======================
This API will return the total number of tests that exist in a protocol 
suite for a given attack 

Mandatory Parameters:
---------------------
-jarfile : The name of the jar file which contain the tests

Returns: 
--------
Total no of tests if success, otherwise 0"
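The "ý" characters in the text above (for example around "root" and "show version") suggest that the CLOB contains bytes outside ASCII that are not valid UTF-8. A possible workaround, sketched below under that assumption, is to decode each byte string to text before inserting it. The helpers to_text and clean_document are hypothetical, and the latin-1 fallback is only a guess; the real source encoding depends on the Oracle database and client character set settings, which the thread does not state.

```python
def to_text(value, fallback_encoding="latin-1"):
    """Decode byte strings to text; leave every other value unchanged."""
    if not isinstance(value, bytes):
        return value
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback encoding; undecodable bytes become U+FFFD
        return value.decode(fallback_encoding, errors="replace")


def clean_document(doc):
    """Return a copy of doc whose byte-string values are decoded to text."""
    return {key: to_text(val) for key, val in doc.items()}


# Hypothetical row shaped like the dicts produced by getRows() in the question
row = {
    "REVIEW_DESCRIPTION": b"-user  : User name, defaults to \xfdroot\xfd",
    "ID": 49390,
}
clean = clean_document(row)
print(repr(clean["REVIEW_DESCRIPTION"]))
```

The cleaned dict could then be passed to the insert call from the question, db.test234.insert(clean), in place of the raw row. Whether the decoded characters are the ones the author of the text intended depends on picking the right fallback encoding.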

