Processing Large Dataset - Class Memory Overhead
Apr 7, 2015
I've been tasked with processing a large dataset as part of a class assignment. One of the fields is a 24-digit unsigned hex number. I realized that, rather than storing the field verbatim in a char array of length 24, I could store the actual value of the hex number in an array only 6 chars long (I chose char over int because chars are unsigned). To do this, I wrote the following simple class to accept the hex string and convert it so that it can be stored in that manner:
private class Hex24
{
private final char[] hexAsInt = new char[6];
public Hex24(String h)
{
for(int index = 0; index < 6; ++index)
hexAsInt[index] = (char)Integer.parseUnsignedInt(h.substring(index * 4, (index + 1) * 4), 16); // radix 16: the input is hex
[Code] ....
Assuming all this works (I haven't tested it, but I think it should), what I'm wondering is how much memory this will save me (if any) compared to just throwing everything into a char[24] array. The underlying char[6] is obviously quite a bit smaller, but objects must take up more space than just their fields, since Java needs to know what kind of object it is so it can know what methods it has, etc. How can I accurately compare the size of a Hex24 object to the size of a 24-character array?
I guess another option/thing I might want to compare size efficiency for is putting the char[6] along with the conversion/comparison logic directly into the classes (as fields/member methods) where I'm currently using Hex24 fields. I'd guess this is the most memory-efficient option I've come up with, but it would lead to a lot of code duplication.
Another thing I'd like to compare is the size of a String vs. the size of the equivalent char array for shortish text fields. If this difference is big enough it might be worth storing those fields as arrays rather than Strings.
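One way to avoid the array overhead altogether, sketched under the assumption that the field is always exactly 24 hex digits: 96 bits fit into one long plus one int, so no inner array object is needed. The class name PackedHex24 and its methods are hypothetical, and actual object sizes vary by JVM, so they would still have to be measured (for example with a tool such as JOL).

```java
/** Hypothetical alternative to Hex24: 24 hex digits = 96 bits = one long + one int. */
final class PackedHex24 implements Comparable<PackedHex24> {
    private final long high; // first 16 hex digits (64 bits)
    private final int low;   // last 8 hex digits (32 bits)

    PackedHex24(String hex) {
        if (hex.length() != 24) {
            throw new IllegalArgumentException("expected exactly 24 hex digits");
        }
        high = Long.parseUnsignedLong(hex.substring(0, 16), 16);
        low = Integer.parseUnsignedInt(hex.substring(16), 16);
    }

    /** Compares the two values as unsigned 96-bit numbers. */
    @Override
    public int compareTo(PackedHex24 other) {
        int c = Long.compareUnsigned(high, other.high);
        return c != 0 ? c : Integer.compareUnsigned(low, other.low);
    }

    /** Rebuilds the 24-digit lowercase hex string. */
    String toHexString() {
        return String.format("%016x%08x", high, low);
    }
}
```

Because the two primitives live directly in the object, each value costs one object header plus 12 bytes of data (before alignment), rather than an object plus a separate array object.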
Feb 21, 2014
I have a PrimeFaces datatable with about 52,000 records to be fetched. Since it is a large dataset, I tried using the live scrolling feature of PrimeFaces with scrollRows set to 20. The table has 53 columns, with filtering and sorting on each column. I am still not satisfied with the table's performance: the page takes about 15 seconds to load and, worse, it takes about 65 seconds to load the next set of 20 records on reaching the end of scrolling.
Just for testing, I reduced the total number of records to 25,000, and the performance improved, with a scroll time of 29 seconds. I really cannot understand why it takes this long when I am displaying only 20 records at a time. The total number of records should not affect the performance.
My JSF code snippet
<p:dataTable id="arcRecList" var="data"
value="#{archivedRecordBean.archiveItems}"
tableStyle="table-layout:auto; width:80%;" styleClass="datatable"
scrollable="true" scrollWidth="84%" scrollHeight="69%"
columnClasses="columnwidth" liveScroll="true" scrollRows="20"
filteredValue="#{archivedRecordBean.filteredArchiveItems}">
[Code] ....
Sep 30, 2014
I have a question regarding best practice in using local variables as my method return variable. I have a method like this:
myReturnObject getMyObject(String input) {
myReturnObject myObject = null;
try {
myObject = helperObject.someOtherMethod().getObject(input); //getObject has return type myReturnObject
} catch (Exception e) {
//log any problems
}
return myObject;
}
I'm wondering whether rewriting it like this would give any performance benefit:
myReturnObject getMyObject(String input) {
try {
return helperObject.someOtherMethod().getObject(input); //getObject has return type myReturnObject
} catch (Exception e) {
//log any problems
}
return null;
}
myObject can be quite large, so I'm wondering whether omitting the myReturnObject local variable would save the garbage collector some work.
Mar 6, 2014
I have to write more than 100,000 rows to an Excel sheet (file size more than 20 MB) via Java.
When I use XSSF, I get the error below:
java.lang.OutOfMemoryError: Java heap space
at org.apache.xmlbeans.impl.store.Saver$TextSaver.resize(Saver.java:1592)
at org.apache.xmlbeans.impl.store.Saver$TextSaver.preEmit(Saver.java:1223)
at org.apache.xmlbeans.impl.store.Saver$TextSaver.emit(Saver.java:1144)
[Code]....
When I use HSSF, I get the error below:
java.lang.OutOfMemoryError: Java heap space
I have tried increasing the Java heap size, up to -Xms1500m -Xmx2048m.
Jan 27, 2014
Since the paint methods are executed any time a frame is moved or resized, how much overhead could an application incur if it
- has many frames
- does a lot of moving
- does a lot of resizing
- has a lot of background processes running?
As a rule, I try to keep only the most essential code in these types of methods, and I am wondering whether that type of thinking is "old school".
How much overhead is actually involved? If you needed to monitor the size of a given frame anytime the user resizes it, would you be concerned that using these methods would incur too much overhead?
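If the goal is only to track a frame's size, one option is a component listener rather than code inside the paint methods. The sketch below assumes that approach; the class name SizeMonitor and its accessors are hypothetical.

```java
import java.awt.Dimension;
import java.awt.event.ComponentAdapter;
import java.awt.event.ComponentEvent;

/** Hypothetical listener that records a component's size whenever it is resized. */
final class SizeMonitor extends ComponentAdapter {
    private Dimension lastSize = new Dimension(0, 0);
    private int resizeCount = 0;

    @Override
    public void componentResized(ComponentEvent e) {
        // Cheap work only: copy the current size and count the event.
        lastSize = e.getComponent().getSize();
        resizeCount++;
    }

    Dimension getLastSize() {
        return new Dimension(lastSize); // defensive copy
    }

    int getResizeCount() {
        return resizeCount;
    }
}
```

It would be registered with frame.addComponentListener(new SizeMonitor()), which keeps size tracking out of the painting path entirely.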
May 25, 2014
I am new to Java/OOP in general, and am trying to implement a multi-threaded system that contains a master thread, and a set of worker threads that are heterogeneous in the work they do. Once they complete the work, the workers indicate to the master by posting the result on to its queue. Here is the problem. The results of each type of work is different, and the master has to process each differently. In C (which I'm familiar with), this can be achieved by having a message type that is a union of all the expected messages, and by using a switch statement.
I thought of doing something similar in Java by using instanceof on each incoming message (each individual message class being a subclass of a common message superclass) and branching on that, but it doesn't seem to be the OO way to do things. The only other way I could think of was to implement an abstract method that returns the type of each message, and then use the type in a switch statement or if-then-else. Is there some other Java idiom for this kind of processing? Also, if this is an acceptable method, why is it superior to using reflection to find out the message type (instead of using the abstract getType())?
The message types look similar to the code below:
abstract class Message {
abstract String getType();
}
class Result1 extends Message {
ResultType1 content;
String getType() {
[Code] ....
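One common alternative to instanceof or getType() is double dispatch: each message calls back the handler method that matches its own class. The following is a minimal sketch of that idea, not the poster's code; all names (ResultHandler, WorkerMessage, CountResult, TextResult, MasterLoop) are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical callback interface: one overload per kind of result. */
interface ResultHandler {
    void handle(CountResult r);
    void handle(TextResult r);
}

/** Each message knows how to hand itself to the matching handler method. */
abstract class WorkerMessage {
    abstract void dispatchTo(ResultHandler handler);
}

final class CountResult extends WorkerMessage {
    final int count;
    CountResult(int count) { this.count = count; }
    @Override void dispatchTo(ResultHandler handler) { handler.handle(this); }
}

final class TextResult extends WorkerMessage {
    final String text;
    TextResult(String text) { this.text = text; }
    @Override void dispatchTo(ResultHandler handler) { handler.handle(this); }
}

/** The master drains its queue without instanceof, switch, or getType(). */
final class MasterLoop implements ResultHandler {
    final BlockingQueue<WorkerMessage> inbox = new LinkedBlockingQueue<>();
    final StringBuilder log = new StringBuilder();

    @Override public void handle(CountResult r) { log.append("count=").append(r.count).append(';'); }
    @Override public void handle(TextResult r) { log.append("text=").append(r.text).append(';'); }

    /** Processes everything currently queued, in arrival order. */
    void drain() {
        WorkerMessage m;
        while ((m = inbox.poll()) != null) {
            m.dispatchTo(this);
        }
    }
}
```

With this design, adding a new message type forces a new handle overload at compile time, whereas a string-based getType() switch can silently miss a case.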
Feb 13, 2015
I am trying to process an array of pixels, checking each within a 5*5 mask. I've declared my mask as a 2D array and get the pixel values within a nested for loop. I'm getting an ArrayIndexOutOfBoundsException at the line:
mask[0][4] = dsIm.red[i-1][j+3];
Have I declared my mask wrong for a 5*5 mask? Is my logic wrong?
I won't give all the code, but basically I'm trying to check all 25 pixel values within the mask:
int [][] mask = new int[5][5];
System.out.println("We are in the method just written");
//loop one loop
for(int i = 1; i <= h-2; i++)
{
for(int j = 1; j<=w-2; j++)
[Code] .....
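A 5*5 window centred on a pixel reaches two pixels in every direction, so the centre has to stay at least two pixels away from each edge. With a loop bound of w-2, an index such as j+3 runs past the last column. The sketch below assumes the channel is indexed as [row][column]; the class and method names are hypothetical.

```java
final class MaskWindow {
    static final int RADIUS = 2; // a 5x5 mask reaches 2 pixels in each direction

    /** Copies the 5x5 neighbourhood centred on (row, col) into a new array. */
    static int[][] window(int[][] channel, int row, int col) {
        int size = 2 * RADIUS + 1;
        int[][] mask = new int[size][size];
        for (int dr = -RADIUS; dr <= RADIUS; dr++) {
            for (int dc = -RADIUS; dc <= RADIUS; dc++) {
                mask[dr + RADIUS][dc + RADIUS] = channel[row + dr][col + dc];
            }
        }
        return mask;
    }

    /** Visits every centre whose full window fits and returns how many were visited. */
    static int countWindows(int[][] channel) {
        int h = channel.length;
        int w = channel[0].length;
        int visited = 0;
        for (int i = RADIUS; i < h - RADIUS; i++) {
            for (int j = RADIUS; j < w - RADIUS; j++) {
                window(channel, i, j); // offsets -2..+2 never leave the array
                visited++;
            }
        }
        return visited;
    }
}
```

Deriving both the loop bounds and the offsets from the same RADIUS constant keeps them consistent if the mask size changes.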
Mar 3, 2013
I am looking for the ability, on the server side, to run programs or "jobs" in a job queue, where the jobs are processed as first in first out. If you are familiar with the IBM iSeries, they have a built in job queue mechanism which accomplishes what I am looking for. The primary purposes for this would be to process and update large amounts of data in a thread safe environment.
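Within a single JVM, a first-in-first-out job queue can be approximated with a single-threaded executor, which runs submitted tasks one at a time in submission order. This is only a sketch of that idea, not an equivalent of the iSeries job queue; the class name FifoJobQueue is hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical FIFO job queue: one worker thread, jobs run in submission order. */
final class FifoJobQueue {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a job; it starts only after all earlier jobs have finished. */
    void submit(Runnable job) {
        worker.execute(job);
    }

    /** Stops accepting jobs and waits for the queued ones; false on timeout or interrupt. */
    boolean shutdownAndWait(long timeoutSeconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

Because only one job runs at a time, jobs never update the data concurrently with each other, although code outside the queue would still need its own synchronization.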
May 9, 2014
It seems we have abandoned Dice/Die(s), and are now working on something completely foreign. I don't even have a code to start with because I haven't the faintest clue what is going on here (no notes given on this topic, as usual). We are given 4 half-written programs to work with.
The instructions are:
"Examine the FormLetterEntry abstract class, and create the two derived classes TextEntry and DataItemEntry. Be sure to implement all the abstract methods in each derived class."
This is the code we were given:
package homework5;
import java.util.Properties;
/**
* Abstract class representing the entries in a form letter.
**/
public abstract class FormLetterEntry {
/**
* Retrieve the template string for this entry
* @return the value of this entry in a template
**/
[code].....
I understand (and correct me if I am wrong) that a derived class is a class that is created from a base class via inheritance. What I don't see are any notes on how to write a derived class. I see some notes online on how to do so, but they don't fit with what he's written above. I also don't understand what he means by "Be sure to implement all the abstract methods in each derived class."
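In general, a derived class uses extends and supplies a body for every method the base class declares abstract. The sketch below illustrates this with a stand-in base class, since the real FormLetterEntry is truncated above; LetterPart, FixedText, Placeholder, and render are all hypothetical names, not the assignment's.

```java
import java.util.Properties;

/** Stand-in base class; NOT the FormLetterEntry from the assignment. */
abstract class LetterPart {
    /** Declared without a body: every concrete subclass must implement it. */
    abstract String render(Properties data);
}

/** Derived class 1: literal text that ignores the data. */
class FixedText extends LetterPart {
    private final String text;

    FixedText(String text) { this.text = text; }

    @Override
    String render(Properties data) { return text; }
}

/** Derived class 2: looks its value up by key, with a visible fallback. */
class Placeholder extends LetterPart {
    private final String key;

    Placeholder(String key) { this.key = key; }

    @Override
    String render(Properties data) { return data.getProperty(key, "<" + key + ">"); }
}
```

If a subclass leaves any abstract method unimplemented, the compiler rejects it unless the subclass is itself declared abstract, which is what "implement all the abstract methods" guards against.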
Oct 9, 2013
class mmm {
public static void main(String[] args) {
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("pp.tif"));
// ocr.setImage((IImageStream) new File("pp.tif"));
[Code] ....
The code above gives this error:
Exception in thread "main" com.aspose.ms.System.IO.FileNotFoundException: Can't find file: pp.tiff.
Where am I going wrong? I am trying to pass images in all formats (.png, .gif, .jpg). I am giving the proper path ....
Oct 8, 2014
I am writing a program that solves sudoku puzzles. It has a Swing GUI with a "solve" button. I have written an actionListener class with an actionPerformed method that is invoked when the user presses this button. It is the following:
public void actionPerformed(ActionEvent event) {
try {
worker = new SudokuSolveWorker();
worker.addPropertyChangeListener(this);
worker.execute();
SolveFrame sf = new SolveFrame();
} catch (Exception exc) {
System.out.println(exc.getMessage());
exc.printStackTrace();
}
}
The code creates a worker, a new instance of SudokuSolveWorker. The worker.execute() statement causes the doInBackground() method of this class to be called. This solves the Sudoku. A property change listener "picks up" the result:
public void propertyChange(PropertyChangeEvent event) {
try {
System.out.println("" + javax.swing.SwingUtilities.isEventDispatchThread());
SudokuModel solvedModel = (SudokuModel) worker.get();
} catch (Exception exc) {
System.out.println(exc.getMessage());
exc.printStackTrace();
}
}
As I wrote, this works in the sense that the program does not appear unstable. However, the user interface stops responding to "commands" (mouse clicks) until the worker thread has finished.
In the first code fragment I create an instance of SolveFrame which is an extension of a JFrame. It is a simple frame with a "cancel" button. It is drawn on the screen, even though it is called after the worker.execute() statement. I'd like the user to be able to click this "cancel" button, after which the solving of the sudoku puzzle should be stopped. However, since the program does not respond to mouse clicks anymore, the "cancel" button cannot be pressed.
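A possible explanation, offered as an assumption since the full worker code is not shown: SwingWorker fires a "state" property change when it starts as well as when it finishes, so a listener that calls worker.get() on every event blocks the event dispatch thread from the first notification until the result is ready. A minimal sketch of a listener that reacts only to completion follows; DoneOnlyListener is a hypothetical name.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import javax.swing.SwingWorker;

/** Hypothetical listener: acts only when the worker reports that it is DONE. */
class DoneOnlyListener implements PropertyChangeListener {
    int completions = 0;

    /** True only for the "state" change whose new value is DONE. */
    static boolean isCompletion(PropertyChangeEvent event) {
        return "state".equals(event.getPropertyName())
                && event.getNewValue() == SwingWorker.StateValue.DONE;
    }

    @Override
    public void propertyChange(PropertyChangeEvent event) {
        if (!isCompletion(event)) {
            return; // e.g. PENDING -> STARTED: calling get() here would block the EDT
        }
        completions++;
        // Safe point to call worker.get(): the result is already available.
    }
}
```

With the event dispatch thread free, a cancel button's action listener can run and call worker.cancel(true); doInBackground() would then need to check isCancelled() periodically to stop early.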
Feb 18, 2014
I want to develop a Java EE application for the following scenario.
The webapp takes a file from a user and analyzes it. This analysis could take hours. The user should be able to check via AJAX whether the analysis is finished. When it is finished, the user should be able to view the report generated by the analyzer.
I looked into the possible ways to achieve this but couldn't get a clear idea. I have heard about JMS, the Work Manager API and servlet asynchronous processing, but I'm still not sure what to use or how to use it. I'm not very familiar with EJB.
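Whatever technology is chosen, the flow described above reduces to three operations: start a job and return an id, poll the id for completion, and fetch the report. The plain-Java sketch below shows only that shape; AnalysisJobs and its methods are hypothetical, and inside a Java EE container a container-managed executor would normally replace the plain thread pool used here.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical registry of long-running analyses, addressed by job id. */
final class AnalysisJobs {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<Long, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Starts the analysis in the background and returns an id the client can poll. */
    long start(Callable<String> analysis) {
        long id = nextId.incrementAndGet();
        jobs.put(id, pool.submit(analysis));
        return id;
    }

    /** What an AJAX status request would ask: is this job finished? */
    boolean isFinished(long id) {
        Future<String> f = jobs.get(id);
        return f != null && f.isDone();
    }

    /** Returns the report, or null if the job is unknown or still running. */
    String reportOrNull(long id) {
        Future<String> f = jobs.get(id);
        if (f == null || !f.isDone()) {
            return null;
        }
        try {
            return f.get(); // does not block: the job is already done
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("analysis failed", e.getCause());
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Since an analysis may run for hours, a real deployment would probably also persist job state so that a server restart does not lose track of running or finished jobs.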
Dec 11, 2014
I have a requirement where ETL jobs run in the backend and fetch up-to-date data from various sources. The data is both structured and unstructured, and the volume is massive, in the petabytes. Based on the fetched data, we need to come up with a consolidated data view. My questions are:
1) How do we create a consolidated data view, and what form should it take (XML, JSON, etc.)? Do we need a unified data model, given that the data is both structured and unstructured? If so, how should we go about it?
2) Since the data is really massive, how can we construct the consolidated data view? Do we need to create it in small chunks, or store the data in a cache or DB from which we retrieve it to form the consolidated view?
3) Do we need Hadoop clusters to process this data in parallel, given that we are talking about massive amounts of structured and unstructured data?
4) Is there a need for a NoSQL database to support the unstructured data, if we are supposed to store it?
Jun 9, 2014
The problem I am trying to solve relates to block-matching or image-within-image recognition. This is an algorithm problem I am working on in Java, but my computer can't handle generating all the combinations at one time.
I see an image, extract the [x,y] of every black pixel and create a set for that image, such as
{[8,0], [9,0], [11,0]}
The set is then augmented so that the first pixel in the set is at [0,0], but the relationship of the pixels is preserved. For example, I see {[8,0], [9,0]} and change the set to {[0,0], [1,0]}. The point of the extraction is that now if I see {[4,0], [5,0]}, I can recognize that basic relationship as two vertically adjacent pixels, my {[0,0], [1,0]}, since it is the same image but only in a different location.
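The translation step described above can be sketched as follows, assuming the pixels arrive in a consistent order (for example, scan order) so that the "first" pixel is well defined. PixelSets, normalize, and key are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class PixelSets {
    /** Shifts all [x,y] pixels so that the first pixel lands on [0,0]. */
    static List<int[]> normalize(List<int[]> pixels) {
        List<int[]> shifted = new ArrayList<>();
        if (pixels.isEmpty()) {
            return shifted;
        }
        int dx = pixels.get(0)[0];
        int dy = pixels.get(0)[1];
        for (int[] p : pixels) {
            shifted.add(new int[] { p[0] - dx, p[1] - dy });
        }
        return shifted;
    }

    /** Canonical text form, usable as a hash-map key for "seen images". */
    static String key(List<int[]> pixels) {
        StringBuilder sb = new StringBuilder();
        for (int[] p : pixels) {
            sb.append('[').append(p[0]).append(',').append(p[1]).append(']');
        }
        return sb.toString();
    }
}
```

Two pixel sets with the same relative layout then produce the same key, so a lookup in a hash map of seen images replaces a pairwise comparison.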
I have a list of these pixel sets, called "seen images". Each 'seen image' has a unique identifier, that allows it to be used as a nested component of other sets. For example:
{[0,0], [1,0]} has the identifier 'Z'
So if I see:
{[0,0], [1, 0], [5,6]}
I can identify and store it as:
{[z], [5, 6]}
The problem with this is that I have to generate every combination of [x,y]'s within the pixel set to check for a pattern match, and to build the best representation. Using the previous example, I have to check:
{[0,0], [1,0]},
{[0,0], [5,6]},
{[1,0], [5,6]} which is {[0,0], [4,6]}
{[0,0], [1,0], [5,6]}
And then if a match occurs, that subset gets replaced with its ID, merged with the remainder of the original set, and the new combination needs to be checked to see whether it is a 'seen image':
{[z],[5, 6]}
The point of all that is to match as many of the [x,y]'s possible, using the fewest pre-existing pieces as components, to represent a newly seen image concisely. The greedy solution to get the component that matches the largest subset is not the right one. Complexity arises in generating all of the combinations that I need to check, and then the combinations that spawn from finding a match, meaning that if some match and swap produces {[z], [1,0], [2,0]}, then I need to check (and if matched, repeat the process):
{[z], [1,0]}
{[z], [2,0]}
{[1,0], [2,0]} which is {[0,0], [1,0]}
{[z], [1,0], [2,0]}
Currently I generate the pixel combinations this way (here I use numbers to represent pixels 1 == [x,y]) Ex. (1, 2, 3, 4): Make 3 lists:
1.) 2.) 3.)
12 23 34
13 24
14
Then for each number, for each list starting at that number index + 1, concatenate the number and each item and store on the appropriate list, ex. (1+23) = 123, (1+24) = 124
1.) 2.) 3.)
12 23 34
13 24
14
---- ---- ----
123 234
124
134
So those are all the combinations I need to check if they are in my 'seen images'. This is a bad way to do this whole process. I have considered different variations / optimizations, including once the second half of a list has been generated (below the ----), check every item on the list for matches, and then destroy the list to save space, and then continue generating combinations. Another option would be to generate a single combination, and then check it for a match, and somehow index the combinations so you know which one to generate next.
How can I optimize what I am doing for a set of roughly a million items? I also have not yet come up with a non-recursive or efficient way to handle the fact that each match generates additional combinations to check.
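The idea of generating one combination at a time and knowing which comes next can be sketched with an index array that is advanced in place, so only the current combination is ever held in memory. This is a standard lexicographic successor step, not a fix for the overall cost: the number of subsets still grows exponentially with the number of pixels, so it saves memory rather than time. The class name Combinations is hypothetical.

```java
final class Combinations {
    /** First k-combination of the indices {0..n-1}: [0, 1, ..., k-1]. */
    static int[] first(int k) {
        int[] idx = new int[k];
        for (int i = 0; i < k; i++) {
            idx[i] = i;
        }
        return idx;
    }

    /** Advances idx to the next combination in lexicographic order; false when exhausted. */
    static boolean next(int[] idx, int n) {
        int k = idx.length;
        int i = k - 1;
        // Find the rightmost position that can still be incremented.
        while (i >= 0 && idx[i] == n - k + i) {
            i--;
        }
        if (i < 0) {
            return false;
        }
        idx[i]++;
        // Reset everything to its right to the smallest valid values.
        for (int j = i + 1; j < k; j++) {
            idx[j] = idx[j - 1] + 1;
        }
        return true;
    }
}
```

Each combination can be checked against the seen images as soon as it is produced and then discarded; the loop is iterative, so no recursion or stored lists are involved.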
Oct 28, 2014
So the assignment is as follows. Develop a new class called BankAccount. A bank account has the owner's name and balance. Be sure to include a constructor that allows the client to supply the owner's name and initial balance. A bank account needs - accessors for the name and balance, mutators for making deposits and withdrawals. I have the following code :
import java.util.Scanner;
public class BankAccount{
public static void main(String [] args){
Scanner reader = new Scanner(System.in);
String name; // the owner's name is text, not a number
double balance;
double deposit;
double withdrawl;
[Code] ....
I am having trouble with my if statements. I don't know how to link the number 1 & 2 keys to deposit and withdrawal actions. Plus I am supposed to have a while loop yet don't know how to implement this so that the while loop will ask the user if they would like to make another transaction after either depositing or withdrawing.
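One way the pieces could fit together, as a sketch under assumptions rather than the assigned solution: the account class holds the state and mutators, and a separate loop maps the keys 1 and 2 to deposit and withdraw, repeating while the user answers "y". The AccountMenu class, the prompts, and the validation rules are hypothetical.

```java
import java.util.Scanner;

final class BankAccount {
    private final String ownerName;
    private double balance;

    BankAccount(String ownerName, double initialBalance) {
        this.ownerName = ownerName;
        this.balance = initialBalance;
    }

    String getOwnerName() { return ownerName; }

    double getBalance() { return balance; }

    void deposit(double amount) {
        if (amount <= 0) throw new IllegalArgumentException("deposit must be positive");
        balance += amount;
    }

    void withdraw(double amount) {
        if (amount <= 0 || amount > balance) throw new IllegalArgumentException("invalid withdrawal");
        balance -= amount;
    }
}

final class AccountMenu {
    /** Repeats transactions until the user answers anything other than "y". */
    static void run(Scanner in, BankAccount account) {
        String again = "y";
        while (again.equalsIgnoreCase("y")) {
            System.out.println("1 = deposit, 2 = withdraw");
            int choice = in.nextInt();
            System.out.println("Amount?");
            double amount = in.nextDouble();
            if (choice == 1) {
                account.deposit(amount);
            } else if (choice == 2) {
                account.withdraw(amount);
            } else {
                System.out.println("Unknown choice: " + choice);
            }
            System.out.println("Balance: " + account.getBalance());
            System.out.println("Another transaction? (y/n)");
            again = in.next();
        }
    }
}
```

A main method would call AccountMenu.run(new Scanner(System.in), new BankAccount("Owner", 100.0)); passing the Scanner in makes the loop easy to drive from a test string as well.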
Jan 21, 2015
I have observed strange behaviour from a ResultSet object. My application fetches 400 records from a table and processes these records every 10 seconds. By default the ResultSet has a fetch size of 10 from the database cursor. As I understand it, if the query returns 400 records, the ResultSet will fetch 40 times, in multiples of 10, to get all 400 records from the database cursor.
Query : SELECT * FROM ( SELECT * FROM TestTable
WHERE STATE_ACTION = 0 ORDER BY ROP_TIME DESC )
WHERE ROWNUM <= 250
Observation: Under normal operation, the ResultSet fetches all 400 records from the database cursor on query execution, but under unknown conditions the same ResultSet object fetches only 10 records from the database cursor and exits. Please refer to page 297 of the document below for the result fetch size details. JDBC developer guide for Oracle 10g: [URL] .... This condition heals itself within a few hours, or after restarting the database or the server. The root cause of this behavior is unknown.