Java: Reading large files into a byte array chunk by chunk


I've been trying to write a small program that reads a file into a byte array, converts that byte array to hex, and then to binary. It will then manipulate the binary values (I haven't decided how yet) and save the result as a custom file.
I studied a lot of code online, and I can turn a file into a byte array and into hex, but I can't turn huge files into byte arrays: I run out of memory.
This is the code that is not a complete failure:
public void rundis(Path pp) {
    byte[] bb = null;
    try {
        // loads the whole file into memory at once, which fails for huge files
        bb = Files.readAllBytes(pp); //Files.toByteArray(pathhold);
        System.out.println("byte array made");
    } catch (Exception e) {
        e.printStackTrace();
    }
    // check for null first, otherwise bb.length throws a NullPointerException
    if (bb != null && bb.length != 0) {
        System.out.println("byte array filled");
        //send to method to turn into hex
    } else {
        System.out.println("byte array NOT filled");
    }
}
I know how the process should go, but I don't know how to code that properly.
The process, if you are interested:
1. Input the file using File.
2. Read the file chunk by chunk into a byte array, e.g. each chunk holds 600 bytes.
3. Send that chunk to be turned into a hex value --> Integer.toHexString
4. Send that hex chunk to be turned into a binary value --> Integer.toBinaryString
5. Mess around with the binary value.
6. Save to a custom file line by line.
Problem: I don't know how to turn a huge file into a byte array chunk by chunk to be processed.
Any and all help will be appreciated, thank you for reading :)
To chunk your input use a FileInputStream:
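Path pp = FileSystems.getDefault().getPath("logs", "access.log");
final int BUFFER_SIZE = 1024 * 1024; // buffer size in bytes (1 MiB)
byte[] buffer = new byte[BUFFER_SIZE];
// try-with-resources closes the stream even if reading fails
try (FileInputStream fis = new FileInputStream(pp.toFile())) {
    int read;
    // read() returns the number of bytes read, or -1 at the end of the file
    while ((read = fis.read(buffer)) != -1) {
        // only the first 'read' bytes of buffer are valid; call your other methods here...
    }
}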
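The same loop can be wrapped in a reusable method. The following is only a minimal sketch: the names ChunkReader, ChunkHandler and readInChunks are made up for illustration, and the 600-byte chunk size is simply the figure from the question. Note that read() may return fewer bytes than the buffer holds, so the handler must respect the length it is given.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ChunkReader {
    /** Hypothetical callback: receives the shared buffer and how many of its bytes are valid. */
    interface ChunkHandler {
        void handle(byte[] buffer, int length) throws IOException;
    }

    /** Reads the file in chunks of at most chunkSize bytes and returns the total bytes read. */
    static long readInChunks(Path file, int chunkSize, ChunkHandler handler) throws IOException {
        long total = 0;
        byte[] buffer = new byte[chunkSize];
        // try-with-resources closes the stream even if the handler throws
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                handler.handle(buffer, read); // only buffer[0..read) is valid
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: ChunkReader <file>");
            return;
        }
        long total = readInChunks(Paths.get(args[0]), 600,
                (buf, len) -> System.out.println("got " + len + " bytes"));
        System.out.println("total: " + total);
    }
}
```

Because the buffer is reused for every chunk, a handler that wants to keep the data beyond the call should copy it first, for example with Arrays.copyOf(buffer, length).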
To stream a file, you need to step away from Files.readAllBytes(). It's a nice utility for small files, but as you noticed not so much for large files.
In pseudocode it would look something like this:
while there are more bytes available
read some bytes
process those bytes
(write the result back to a file, if needed)
In Java, you can use a FileInputStream to read a file byte by byte or chunk by chunk. Let's say we want to write back our processed bytes. First we open the files:
FileInputStream is = new FileInputStream(new File("input.txt"));
FileOutputStream os = new FileOutputStream(new File("output.txt"));
We need the FileOutputStream to write back our results - we don't want to just drop our precious processed data, right? Next we need a buffer which holds a chunk of bytes:
byte[] buf = new byte[4096];
How many bytes is up to you; I kinda like chunks of 4096 bytes. Then we need to actually read some bytes:
int read = is.read(buf);
This will read up to buf.length bytes and store them in buf. It returns the number of bytes actually read, or -1 if the end of the file has been reached. Then we process the bytes:
//Assuming the processing function looks like this:
//byte[] process(byte[] data, int bytes);
byte[] ret = process(buf, read);
process() in the above example is your processing method. It takes a byte array and the number of bytes it should process, and returns the result as a byte array.
Last, we write the result back to a file:
os.write(ret);
We have to execute this in a loop until there are no bytes left in the file, so let's write a loop for it:
int read = 0;
while ((read = is.read(buf)) > 0) {
    byte[] ret = process(buf, read);
    os.write(ret);
}
And finally, close the streams:
is.close();
os.close();
And that's it. We processed the file in 4096-byte chunks and wrote the result back to a file. It's up to you what to do with the result: you could send it over TCP, or drop it if it's not needed. You could also read from TCP instead of a file; the basic logic is the same.
This still needs some proper error handling to deal with missing files or wrong permissions, but that's up to you to implement.
An example implementation of the process method:
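Put together, the steps above might look like the following sketch. It uses try-with-resources so that both streams are closed even when an exception is thrown, and it reports a missing input file separately from other I/O errors. The class name ChunkedCopy, the method transform, and the pass-through process() are placeholders chosen for illustration, not part of the original answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

final class ChunkedCopy {
    // Placeholder for the real processing step (assumption): returns the valid bytes unchanged.
    static byte[] process(byte[] data, int length) {
        return Arrays.copyOf(data, length);
    }

    // Streams input to output in chunks of up to 4096 bytes; both streams are closed automatically.
    static void transform(Path input, Path output) throws IOException {
        byte[] buf = new byte[4096];
        try (InputStream is = Files.newInputStream(input);
             OutputStream os = Files.newOutputStream(output)) {
            int read;
            while ((read = is.read(buf)) != -1) {
                os.write(process(buf, read));
            }
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: ChunkedCopy <input> <output>");
            return;
        }
        try {
            transform(Paths.get(args[0]), Paths.get(args[1]));
        } catch (NoSuchFileException e) {
            System.err.println("File not found: " + e.getFile());
        } catch (IOException e) {
            System.err.println("I/O error: " + e.getMessage());
        }
    }
}
```

Only one chunk is held in memory at a time, so the size of the input file no longer matters.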
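// returns the hex representation of the bytes as ASCII characters
// needs: import java.nio.charset.StandardCharsets;
public static byte[] process(byte[] bytes, int length) {
    final byte[] hexchars = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
    // two hex digits per input byte; a byte[] matches the declared return type
    byte[] ret = new byte[length * 2];
    for (int i = 0; i < length; ++i) {
        int b = bytes[i] & 0xFF; // treat the byte as unsigned
        ret[i * 2] = hexchars[b >>> 4];
        ret[i * 2 + 1] = hexchars[b & 0x0F];
    }
    return ret;
}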
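The question also asks for a binary representation. The bytes can be converted directly, without going through hex first. Here is a minimal sketch, assuming a text form of eight '0'/'1' characters per byte is wanted; the class name BinaryText and the method toBinary are made up for illustration.

```java
final class BinaryText {
    // Returns the first 'length' bytes as a string of 8 binary digits per byte.
    static String toBinary(byte[] bytes, int length) {
        StringBuilder sb = new StringBuilder(length * 8);
        for (int i = 0; i < length; i++) {
            int b = bytes[i] & 0xFF; // treat the byte as unsigned
            // Setting bit 8 makes toBinaryString emit 9 digits, which preserves
            // leading zeros; substring(1) then drops the extra leading '1'.
            sb.append(Integer.toBinaryString(b | 0x100).substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x0A, (byte) 0xFF};
        System.out.println(toBinary(sample, sample.length)); // prints 0000101011111111
    }
}
```

Like process(), this takes the number of valid bytes as a parameter, so it can be called once per chunk inside the read loop.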