java


Java Reading large files into byte array chunk by chunk


So I've been trying to make a small program that reads a file into a byte array, then turns that byte array into hex, then binary. It will then play with the binary values (I haven't thought of what to do when I get to this stage) and save the result as a custom file.
I studied a lot of code on the internet, and I can turn a file into a byte array and into hex, but the problem is I can't turn huge files into byte arrays (out of memory).
This is the code that is not a complete failure:
public void rundis(Path pp) {
    byte[] bb = null;
    try {
        bb = Files.readAllBytes(pp); //Files.toByteArray(pathhold);
        System.out.println("byte array made");
    } catch (Exception e) {
        e.printStackTrace();
    }
    // check for null first, otherwise bb.length throws a NullPointerException
    if (bb != null && bb.length != 0) {
        System.out.println("byte array filled");
        //send to method to turn into hex
    } else {
        System.out.println("byte array NOT filled");
    }
}
I know how the process should go, but I don't know how to code that properly.
The process if you are interested:
Input file using File
Read the file chunk by chunk into a byte array, e.g. each chunk holds 600 bytes
Send that chunk to be turned into a hex value --> Integer.toHexString
Send that hex chunk to be turned into a binary value --> Integer.toBinaryString
Mess around with the Binary value
Save to custom file line by line
Problem: I don't know how to turn a huge file into a byte array chunk by chunk so it can be processed.
Any and all help will be appreciated, thank you for reading :)
To chunk your input, use a FileInputStream:
Path pp = FileSystems.getDefault().getPath("logs", "access.log");
final int BUFFER_SIZE = 1024 * 1024; // 1 MiB; the size is given in bytes
byte[] buffer = new byte[BUFFER_SIZE];
// try-with-resources closes the stream even if an exception is thrown
try (FileInputStream fis = new FileInputStream(pp.toFile())) {
    int read;
    while ((read = fis.read(buffer)) > 0) {
        // only buffer[0] .. buffer[read - 1] hold valid data;
        // call your other methods here...
    }
}
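One detail the loop above leaves implicit: the last chunk is usually shorter than the buffer, so whatever you call inside the loop should be told how many bytes are valid. A minimal sketch of that idea follows. The names ChunkReader, ChunkHandler and forEachChunk are made up for illustration, and the demo reads from an in-memory stream instead of a real file so it runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkReader {
    // Hypothetical callback: receives the shared buffer and the number of valid bytes in it.
    interface ChunkHandler {
        void handle(byte[] buffer, int length) throws IOException;
    }

    // Reads the stream in chunks of at most bufferSize bytes and passes each chunk
    // to the handler. Returns the total number of bytes read.
    static long forEachChunk(InputStream in, int bufferSize, ChunkHandler handler) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 at end of stream
        while ((read = in.read(buffer)) != -1) {
            handler.handle(buffer, read); // only the first `read` bytes are valid
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // 10 bytes read through a 4-byte buffer: the final chunk is only 2 bytes long
        InputStream in = new ByteArrayInputStream(new byte[10]);
        long total = forEachChunk(in, 4, (buf, len) -> System.out.println("chunk of " + len + " bytes"));
        System.out.println("total: " + total);
    }
}
```

For a real file you would pass a FileInputStream instead of the ByteArrayInputStream; the loop itself stays the same.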
To stream a file, you need to step away from Files.readAllBytes(). It's a nice utility for small files, but as you noticed it doesn't work for large ones, because it loads the whole file into memory at once.
In pseudocode it would look something like this:
while there are more bytes available
    read some bytes
    process those bytes
    (write the result back to a file, if needed)
In Java, you can use a FileInputStream to read a file byte by byte or chunk by chunk. Let's say we want to write back our processed bytes. First we open the files:
FileInputStream is = new FileInputStream(new File("input.txt"));
FileOutputStream os = new FileOutputStream(new File("output.txt"));
We need the FileOutputStream to write back our results - we don't want to just drop our precious processed data, right? Next we need a buffer which holds a chunk of bytes:
byte[] buf = new byte[4096];
How many bytes is up to you; I kinda like chunks of 4096 bytes. Then we need to actually read some bytes:
int read = is.read(buf);
This will read up to buf.length bytes and store them in buf. It returns the number of bytes actually read, or -1 once the end of the file is reached. Then we process the bytes:
//Assuming the processing function looks like this:
//byte[] process(byte[] data, int bytes);
byte[] ret = process(buf, read);
process() in the above example is your processing method. It takes a byte array and the number of bytes it should process, and returns the result as a byte array.
Last, we write the result back to a file:
os.write(ret);
We have to execute this in a loop until there are no bytes left in the file, so let's write a loop for it:
int read = 0;
while ((read = is.read(buf)) > 0) {
    byte[] ret = process(buf, read);
    os.write(ret);
}
And finally, close the streams:
is.close();
os.close();
And that's it. We processed the file in 4096-byte chunks and wrote the result back to a file. It's up to you what to do with the result: you could send it over TCP, drop it if it's not needed, or even read from TCP instead of a file. The basic logic is the same.
This still needs proper error handling to cope with missing files or wrong permissions, but that's up to you to implement.
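As a starting point for that error handling, here is one possible sketch using try-with-resources, which closes both streams even when an exception is thrown halfway through. The class name ChunkedCopy and the method transform are made up for this example, and process() is only a stand-in that returns the valid bytes unchanged; swap in your own processing.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;

class ChunkedCopy {
    // Placeholder processing step (an assumption for this sketch):
    // returns the first `length` bytes unchanged.
    static byte[] process(byte[] data, int length) {
        return Arrays.copyOf(data, length);
    }

    // Reads inputPath in 4096-byte chunks, processes each chunk and writes the
    // result to outputPath. Returns the number of bytes read from the input.
    static long transform(String inputPath, String outputPath) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        // Both streams are closed automatically, in reverse order of creation.
        try (FileInputStream is = new FileInputStream(inputPath);
             FileOutputStream os = new FileOutputStream(outputPath)) {
            int read;
            while ((read = is.read(buf)) != -1) {
                os.write(process(buf, read));
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: ChunkedCopy <input> <output>");
            return;
        }
        try {
            long total = transform(args[0], args[1]);
            System.out.println("processed " + total + " bytes");
        } catch (FileNotFoundException e) {
            // thrown when a file is missing, is a directory, or cannot be opened (e.g. permissions)
            System.err.println("cannot open file: " + e.getMessage());
        } catch (IOException e) {
            System.err.println("I/O error while processing: " + e.getMessage());
        }
    }
}
```

Whether to report the error, retry, or rethrow depends on your program; the sketch just prints a message.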
An example implementation of the process method:
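//returns the hex representation of the bytes as ASCII characters
//(needs: import java.nio.charset.StandardCharsets;)
public static byte[] process(byte[] bytes, int length) {
    final byte[] hexchars = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
    //two hex digits per input byte
    byte[] ret = new byte[length * 2];
    for (int i = 0; i < length; ++i) {
        int b = bytes[i] & 0xFF;               //treat the byte as unsigned
        ret[i * 2] = hexchars[b >>> 4];        //high nibble
        ret[i * 2 + 1] = hexchars[b & 0x0F];   //low nibble
    }
    return ret;
}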
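The question also wants a binary representation. The plan there goes from hex to binary, but the binary digits can be produced straight from each byte with Integer.toBinaryString. A rough sketch, where BinaryDump and toBinary are names invented for this example:

```java
class BinaryDump {
    // Hypothetical helper: renders the first `length` bytes as 8-digit binary
    // groups separated by single spaces.
    static String toBinary(byte[] bytes, int length) {
        StringBuilder sb = new StringBuilder(length * 9);
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // & 0xFF treats the byte as unsigned; | 0x100 forces a leading 1 so that
            // toBinaryString always yields 9 digits, and substring(1) drops it again.
            // This keeps the leading zeros that toBinaryString would otherwise omit.
            sb.append(Integer.toBinaryString((bytes[i] & 0xFF) | 0x100).substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x0A, (byte) 0xFF, 0x00};
        System.out.println(toBinary(sample, sample.length)); // prints 00001010 11111111 00000000
    }
}
```

Like process(), it takes the number of valid bytes, so it can be called on each chunk inside the read loop.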
