Main   Products   Offshore Outsourcing   Customers   Partners   ContactUs  
JDBC Databases
  HXTT Access v7.1
  HXTT Cobol v5.0
  HXTT DBF v7.1
  HXTT Excel v6.1
  HXTT Json v1.0
  HXTT Paradox v7.1
  HXTT PDF v2.0
  HXTT Text(CSV) v7.1
 
  Buy Now
  Support
  Download
  Document
  FAQ
  HXTT Word v1.1
  HXTT XML v4.0
Offshore Outsourcing
Free Resources
  Firewall Tunneling
  Search Indexing Robot
  Conditional Compilation
  Password Recovery for MS Access
  Password Recovery for Corel Paradox
  Checksum Tool for MD5
  Character Set Converter
  Pyramid - Poker of ZYH
   
   
   
Heng Xing Tian Tai Lab of Xi'an City (abbr, HXTT)

HXTT Text(CSV)
HXXT Text jdbc - querying multiple files
Joseph
2006-03-31 08:30:45
Hello,
I am a programmer for a fortune 100 company and I feel that your HXXT Text jdbc driver can be used in our application to access our large ASCII files.
So before I recomend this to pruchase I would like to make sure that it is right fit for application.
We are getting ASCII files as fixed length records on a dialy basis.The nummer of records can be millions.A single file size can range form 500 MB to 1 GB.
What we have to do is to query these files based on various conditions just like a database. The conditions can be for a date range and/or various fields values in the ASCII files.
So it defenitily must search on multiple files at a time to get the required results.
So please let me know whether your driver is capable of doing this in an effective way.

Thanks,
Joseph
Re:HXXT Text jdbc - querying multiple files
HXTT Support
2006-04-02 07:25:56
Yeah. For join query on million level tables, HXTT Text is your last hope. Visit http://www.hxtt.com/text/history.html, you will find some interesting features for your project:
v2.2.008 uses automatic temporary index to quicken subquery sql.
v2.1.125 uses automatic temporary index to quicken join sql.
v2.1.125 provides automatic temporary index feature option.
v2.1.122 provides universal table declaration for a series of tables with the same table structure.
v2.1.122 optimizes IN, NOT IN, ALL, and ANY on subquery.
v2.1.111 supports seamlessly files and directories in ZIP and GZIP file formats(ZIP,JAR,TAR) in jdbc url and sql.
v2.1.101 provides readonly mode for speed optimization.
v2.1.92 supports readonly database, for instance, CD and DVD.

http://www.hxtt.com/test/text/barcelona.html shows a join query speed on five tables. I'm pasting a simple join speed test on plain text for three million level tables:
/* A simple test for JOIN on three tables(YM_INV 5000000 rows(1972MB), YM_DOC 1000000 rows(404MB), YM_PAT 500000 rows (202MB))
Tested with AMD2500 on 2006-04-02
SQL:select count(*) from YM_INV this_ join YM_DOC yorkmedica2_ on this_.DOCTOR=yorkmedica2_.CODE join YM_PAT yorkmedica3_ on this_.ID=yorkmedica3_.ID
Row count:5000000
Spent time:192516ms
SQL:select count(*) from YM_INV this_ join YM_DOC yorkmedica2_ on this_.DOCTOR=yorkmedica2_.CODE join YM_PAT yorkmedica3_ on this_.ID=yorkmedica3_.ID
Row count:5000000
Spent time:120750ms

*/
import java.sql.*;
import java.util.Properties;

public class testText16{
public static void main(String argv[]){
try{
Class.forName("com.hxtt.sql.text.TextDriver");


String url="jdbc:text:////d:/test";

Properties properties=new Properties();

properties.setProperty("fileExtension", "txt");

Connection con = DriverManager.getConnection(url,properties);

Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_READ_ONLY);
String sql="create table YM_INV (doctor char(10),id char(10),status varchar (2), notsure varchar (7), lastName varchar (20), firstName varchar (11), gender varchar (1), dob varchar (8), HealthCardNum varchar (12), healthCardversionNum varchar (2), number varchar (8), comments varchar (20), streetAdressLine1 varchar (30), "
+"city varchar (15), province varchar (3), postalCode varchar (6), homePhone varchar (10), workPhone varchar (10), secondAddress varchar (81), dunno varchar (49), dunno1 varchar (4), due varchar (8), due2 varchar (8), patKey2 varchar (3), morepadding varchar (16), flag varchar (1), eor varchar (35),"
+"tempPK char (14)); "
+"create table YM_DOC (code char(10),name char(20),status varchar (2), notsure varchar (7), lastName varchar (20), firstName varchar (11), gender varchar (1), dob varchar (8), HealthCardNum varchar (12), healthCardversionNum varchar (2), number varchar (8), comments varchar (20), streetAdressLine1 varchar (30), "
+"city varchar (15), province varchar (3), postalCode varchar (6), homePhone varchar (10), workPhone varchar (10), secondAddress varchar (81), dunno varchar (49), dunno1 varchar (4), due varchar (8), due2 varchar (8), patKey2 varchar (3), morepadding varchar (16), flag varchar (1), eor varchar (35),"
+"tempPK char (14));"
+"create table YM_PAT (id char(10),name char(20),status varchar (2), notsure varchar (7), lastName varchar (20), firstName varchar (11), gender varchar (1), dob varchar (8), HealthCardNum varchar (12), healthCardversionNum varchar (2), number varchar (8), comments varchar (20), streetAdressLine1 varchar (30), "
+"city varchar (15), province varchar (3), postalCode varchar (6), homePhone varchar (10), workPhone varchar (10), secondAddress varchar (81), dunno varchar (49), dunno1 varchar (4), due varchar (8), due2 varchar (8), patKey2 varchar (3), morepadding varchar (16), flag varchar (1), eor varchar (35),"
+"tempPK char (14));"
;
stmt.executeUpdate(sql);

boolean insertDataFlag=false;
if(insertDataFlag){
sql="insert into YM_INV (doctor,id) values(?,?);";
PreparedStatement pstmt = con.prepareStatement(sql);
for(int i=0;i<5000000;i++){
pstmt.setInt(1,i%2);
pstmt.setInt(2,i%10);
pstmt.executeUpdate();
}
pstmt.close();

sql="insert into YM_DOC (code,name) values(?,?);";
pstmt = con.prepareStatement(sql);
for(int i=0;i<1000000;i++){
pstmt.setInt(1,i);
pstmt.setString(2,"doc"+i);
pstmt.executeUpdate();
}
pstmt.close();

sql="insert into YM_PAT (id,name) values(?,?);";
pstmt = con.prepareStatement(sql);
for(int i=0;i<500000;i++){
pstmt.setInt(1,i);
pstmt.setString(2,"patient"+i);
pstmt.executeUpdate();
}
pstmt.close();
}




String[] sqls=new String[]{
"select count(*) from YM_INV this_ join YM_DOC yorkmedica2_ on this_.DOCTOR=yorkmedica2_.CODE join YM_PAT yorkmedica3_ on this_.ID=yorkmedica3_.ID",
"select count(*) from YM_INV this_ join YM_DOC yorkmedica2_ on this_.DOCTOR=yorkmedica2_.CODE join YM_PAT yorkmedica3_ on this_.ID=yorkmedica3_.ID",
};

for(int i=0;i System.out.println("SQL:" +sqls[i]);
long currentTime=System.currentTimeMillis();
ResultSet rs = stmt.executeQuery(sqls[i]);
rs.beforeFirst();
while (rs.next()) {
System.out.println("Row count:" +rs.getObject(1));
}
currentTime=System.currentTimeMillis()-currentTime;

rs.close();
System.out.println("Spent time:" +currentTime+"ms" );
}

stmt.close();

con.close();


}
catch( SQLException sqle )
{
do
{
System.out.println(sqle.getMessage());
System.out.println("Error Code:"+sqle.getErrorCode());
System.out.println("SQL State:"+sqle.getSQLState());
sqle.printStackTrace();
}while((sqle=sqle.getNextException())!=null);
}
catch( Exception e )
{
System.out.println(e.getMessage());
e.printStackTrace();
}
}
}



Re:Re:HXXT Text jdbc - querying multiple files
Joseph
2006-04-04 08:14:18
Thanks for your detailed input.
I have recommended this to my manager.
I will get back to you when I hear from him.
Thanks,
Joseph

Search Key   Search by Last 50 Questions




Google
 

Email: webmaster@hxtt.com
Copyright © 2003-2019 Heng Xing Tian Tai Lab of Xi'an City. | All Rights Reserved. | Privacy | Legal | Refund | Sitemap