R Package with S4 Objects

When building R packages, Linux may be the best choice, because packaging on Windows requires installing a lot of base Linux applications and tools on Windows. For those who wish to go down that road, here’s what one has to do: read Making R Packages Under Windows: A Tutorial by Peter Rossi.
For everyone else, just boot Linux.

Before starting, one should take a look at Writing R Extensions. This is the first step for all who wish to build R extensions. A fast and easy way to learn how to build a package is to read An Introduction to the R package mechanism.

When developing using S4 objects, packaging may become a big headache.
I’ve suffered this headache. I had all kinds of warnings and errors during the package tests and installation.

I started by posting to the R-Help list asking for help. Here’s the post transcript:

Here's what I do:

1. in R console, I do and get:
> package.skeleton(name='remora')remora-package
Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './remora/Read-and-delete-me'.
Warning messages:
1: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able
2: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able
3: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able
4: In dump(internalObjs, file = file.path(code_dir, sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able

I don't know why I get these warnings. I've followed R implementation rules and the S4 objects work fine.

2. Performing the 'R CMD build remora' command I get:

* checking for file 'remora/DESCRIPTION' ... OK
* preparing 'remora':
* checking DESCRIPTION meta-information ... OK
* removing junk files
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building 'remora_1.0.tar.gz'

And the remora_1.0.tar.gz file seems ok.

3. Performing the 'R CMD check remora' command I get:

* checking for working pdflatex ...sh: pdflatex: not found
NO
* checking for working latex ...sh: latex: not found
NO
* using log directory '/home/fmm/thesis/R/src/remora.Rcheck'
* using R version 2.8.1 (2008-12-22)
* using session charset: UTF-8
* checking for file 'remora/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'remora' version '1.0'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking for executable files ... OK
* checking whether package 'remora' can be installed ... ERROR
Installation failed.
See '/home/fmm/thesis/R/src/remora.Rcheck/00install.out' for details.

4. the log file contains:

* Installing *source* package 'remora' ...
** R
** data
** preparing package for lazy loading
Error in parse(n = -1, file = file) : unexpected '-> code2LazyLoadDB
-> sys.source -> parse Execution halted
ERROR: lazy loading failed for package 'remora'
** Removing '/home/fmm/thesis/R/src/remora.Rcheck/remora'
fmm@Darkmaster:~/thesis/R/src$ grep -i __C__remoraConfiguration *
fmm@Darkmaster:~/thesis/R/src$ grep -i __C__remoraConfiguration */*
remora.Rcheck/00install.out:745: `.__C__remoraConfiguration`

But, unfortunately, no help came from there…

In the quest for a solution, I found out that many others were having similar problems with S4 object packaging, but no solutions were provided.
After a lot of investigation, I finally found one, and it is actually quite easy.
Instead of building the package from the objects in the current environment, I simply passed package.skeleton the list of source files through its code_files argument. As an example, here’s the small R script I wrote to automate the packaging procedure for my “Remora” package:

cat('\nPacking Remora...\n')

file_lst <- character(5)
file_lst[1] <- '/home/m6/thesis/R/src/1_classes.r'
file_lst[2] <- '/home/m6/thesis/R/src/2_common.r'
file_lst[3] <- '/home/m6/thesis/R/src/3_model.r'
file_lst[4] <- '/home/m6/thesis/R/src/4_predict.r'
file_lst[5] <- '/home/m6/thesis/R/src/5_main.r'   

package.skeleton(name = "remora", force = TRUE, namespace = TRUE,
code_files = file_lst)

cat('\nDone.\n')

Now it's time to go to the directory that was created with the name of the package (from now on {package}) and edit the following files:

  • {package}/DESCRIPTION, the description of the package;
  • {package}/NAMESPACE, the list of functions and classes to export to the user;
  • {package}/man/{package}-package.Rd, the package help file; its \examples section must contain executable code, since the R check command runs it;
  • {package}/man/{class_name}-class.Rd, the classes help files;
  • {package}/man/{function_name}.Rd, the functions help files.

All files under /man/ are Rd files, written in a LaTeX-like markup, and are compiled to provide the help pages shown when the user asks for them.
It's only necessary to document the classes and functions that are exported in the NAMESPACE file, since all the others will not be visible to the user. All the other .Rd files may be deleted.

After the package was created, I tested it with the "R CMD check {package}" command. My package name is "remora", so my command was "sudo R CMD check remora".
I had to run this command as administrator, so I prefixed it with the "sudo" command. This is a characteristic of my Kubuntu installation, and one may not need administrator privileges to do this.

Finally, I built the package with the "R CMD build {package}" command, which created the tar.gz file for distribution.

To install it, just use the "R CMD INSTALL {package}" command. I then started R, loaded the package, and it worked fine.
To uninstall it, just execute "R CMD REMOVE {package}".

RCP Message Dialog

While developing applications, it’s usual to have a common set of functionalities used across most of them.
Message dialogs are one example. In Rich Client Platform (RCP) applications, it may take too much time to find which JFace dialog class best suits a given purpose…

The following notes will help in identifying which dialog to use. All the methods needed are statically available in the org.eclipse.jface.dialogs.MessageDialog class:

  • MessageDialog.openConfirm, for a confirmation dialog with an OK/Cancel button set.
  • MessageDialog.openError, for an error dialog with an OK button.
  • MessageDialog.openInformation, for an information dialog with an OK button.
  • MessageDialog.openQuestion, for a question dialog with a Yes/No button set.
  • MessageDialog.openWarning, for a warning dialog with an OK button.

The org.eclipse.jface.dialogs.MessageDialogWithToggle class is similar, but allows the user to adjust a toggle setting, like Yes Always/Yes/No or Yes/No/Never.

One can use org.eclipse.jface.dialogs.DialogSettings to store dialog settings; it supports loading and saving properties in an XML file.

One can use org.eclipse.jface.dialogs.ProgressMonitorDialog to display progress during a long running operation.

And, finally, one can design one's own dialog windows.
To do it, one just has to extend the org.eclipse.jface.dialogs.IconAndMessageDialog class.

Note: in RCP, the shell can be obtained using
PlatformUI.getWorkbench().getActiveWorkbenchWindow().getShell()

Subversion project tree structure

Subversion (Svn) has a simple tree structure concept for project versioning.
It has a main trunk, tags and branches, each in its own location:

  • {repository}/trunk/{project}
  • {repository}/tags/{project}
  • {repository}/branches/{project}

This is the common structure.

The trunk is where the daily development goes. Here’s a couple of examples:
svn://myserver.net/repository/trunk/trocaqui
svn://myserver.net/repository/trunk/data_generator

The tags directory is where tags go. Here’s the same example:
svn://myserver.net/repository/tags/trocaqui-rev1
svn://myserver.net/repository/tags/data_generator-rev0.3

The branches directory is where branches go. Again, the same example:
svn://myserver.net/repository/branches/trocaqui-v1.1
svn://myserver.net/repository/branches/data_generator-rc1

For more information about Svn, read Version Control with Subversion.

Interpreting DB2 JDBC error messages

On data migration projects using DB2, one sometimes runs into awkward DB2/JDBC errors.
It does not matter if it is DB2 on Windows, Unix or iSeries AS/400.

Part of the problem is that the error messages come localized, in our case in Portuguese, which makes debugging a lot harder, since it’s almost impossible to find decent help by searching for the translated messages. The other part of the problem is that the SQL error codes are sometimes unclear.
Here’s an example of a common problem that we have in data migration projects:

com.ibm.db2.jcc.b.rg: [jcc][t4][102][10040][3.50.152] Non-atomic batch failure. The batch was submitted, but at least one exception occurred
on an individual member of the batch.
Use getNextException() to retrieve the exceptions for specific batched elements.

com.ibm.db2.jcc.b.pm: Error for batch element #1: The current
transaction was rolled back because of error "-289".. SQLCODE=-1476,
SQLSTATE=40506, DRIVER=3.50.152

com.ibm.db2.jcc.b.SqlException: [jcc][103][10843][3.50.152]
[...] ERRORCODE=-4225, SQLSTATE=null

According to the official DB2 documentation:

  • SQLCODE=-1476 means that the current transaction was rolled back because of one of the errors listed in the documentation, but none of the listed errors seemed to apply.
  • SQLSTATE=40506 means that the current transaction was rolled back because of an SQL error, which is basically the same as the SQLCODE above.
  • ERRORCODE=-4225 means an error occurred when data was sent to a server or received from a server, which is totally useless.

The really useful information is hidden in the second error message, in the “transaction was rolled back because of error "-289"” text. The key here is the -289 error.
This error means “Unable to allocate new pages in table space”, and this is the real cause of such a big fuss.

In data migration projects it’s common for some of the DB2 table spaces to run out of space. But with such an error message, the error itself is kind of hidden in the middle of the stack trace, and all that is shown is a loose error code…
For the sake of the developers’ productivity, the cause of the error should be highlighted, clearly visible and easy to understand, so that it can be quickly identified and fixed.
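Since the driver message itself says to use getNextException(), here is a minimal sketch of walking that chain so that every nested SQLCODE and SQLSTATE is printed next to its message. The class name SqlErrorChain and the method describeChain are hypothetical names chosen for this illustration, and the exceptions in main are hand-built stand-ins that reuse the codes quoted above, not real DB2 driver output.

```java
import java.sql.BatchUpdateException;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class SqlErrorChain {

    // Returns one summary line per exception linked through getNextException()
    public static List<String> describeChain(SQLException first) {
        List<String> lines = new ArrayList<>();
        for (SQLException e = first; e != null; e = e.getNextException()) {
            lines.add("SQLCODE=" + e.getErrorCode()
                    + ", SQLSTATE=" + e.getSQLState()
                    + ", message=" + e.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hand-built stand-ins (assumption): a batch failure whose chained
        // exception carries the message that mentions the real cause
        BatchUpdateException batch = new BatchUpdateException(
                "Non-atomic batch failure.", null, -4225, new int[0]);
        batch.setNextException(new SQLException(
                "The current transaction was rolled back because of error \"-289\".",
                "40506", -1476));

        describeChain(batch).forEach(System.out::println);
    }
}
```

In a real migration job, the same loop would go in the catch block around executeBatch(), so the line mentioning -289 shows up in the log instead of staying buried in the chain.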

Timezone date and time conversion

Sometimes dates and times come from different timezones and one needs to convert them into one’s own timezone.
To do this in Java, one just has to parse the original date and time in its original timezone and then format the resulting date and time in the required timezone.

Here’s a snippet showing how to convert a date and time from the GMT+1 timezone into the GMT timezone:

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Date and time
String dateTime = "2011-07-01 09:45:12";

// GMT+1 date and time format definition
DateFormat df1 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
df1.setTimeZone(TimeZone.getTimeZone("GMT+1"));

// Parses the date and time assuming it's in the GMT+1 timezone
// (parse throws java.text.ParseException if the text does not match)
Date dateToConvert = df1.parse(dateTime);
// Shows the original date and time in the GMT+1 timezone
System.out.println("GMT+1 date and time: " + df1.format(dateToConvert));

// GMT date and time format definition
DateFormat df2 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
df2.setTimeZone(TimeZone.getTimeZone("GMT"));

// Formats the same instant in the final GMT timezone
System.out.println("  GMT date and time: " + df2.format(dateToConvert));
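As a complement, here is a sketch of the same parse-then-format idea written with the java.time API, assuming Java 8 or later is available. This is a different API from the one used in the snippet above, not a replacement for it, and the class name ZoneConversion and the method convert are hypothetical names used only for this example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class ZoneConversion {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Reads dateTime as local time in the "from" zone and returns the same
    // instant formatted in the "to" zone
    public static String convert(String dateTime, ZoneId from, ZoneId to) {
        LocalDateTime local = LocalDateTime.parse(dateTime, FORMAT);
        ZonedDateTime source = local.atZone(from);
        return source.withZoneSameInstant(to).format(FORMAT);
    }

    public static void main(String[] args) {
        // GMT+1 is expressed as a fixed offset of one hour
        System.out.println(convert("2011-07-01 09:45:12",
                ZoneOffset.ofHours(1), ZoneOffset.UTC));
    }
}
```

For the value used above, this turns 09:45:12 at GMT+1 into 08:45:12 at GMT. Passing a region such as ZoneId.of("Europe/Lisbon") instead of a fixed offset would also take daylight saving time into account.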

Find Java Class Path in the System

Many times one has to know where in the system some Java class is located, usually when one has to perform file operations relative to the class location.

In Java this task is not straightforward to accomplish. Here’s a snippet that will return the full path of the class location:

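import java.io.File;
import java.net.URISyntaxException;
import java.security.CodeSource;
import java.security.ProtectionDomain;

public class Utilities {
 protected static final String getApplicationLocation() throws URISyntaxException {

    final ProtectionDomain pd = Utilities.class.getProtectionDomain();
    final CodeSource cs = pd.getCodeSource();

    // Going through a File keeps the leading "/" on Unix-like systems,
    // drops it on Windows, and decodes escapes such as %20 in the path
    return new File(cs.getLocation().toURI()).getPath();
 }
}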

Selecting a value from a SWT CCombo on RCP

It looks like the CCombo custom SWT widget lacks some selection functionality. When using a CCombo as a table cell editor in a Rich Client Platform (RCP) application, it is (almost) impossible to detect the user selection with both the keyboard and the mouse.

The selection listener does not work as expected. Documentation says:

  • the widgetSelected method is triggered whenever a selection occurs in the control, i.e. when the user browses the option list this event is triggered;
  • the widgetDefaultSelected method is triggered when default selection occurs in the control, i.e. when the user selects an option from the list this event is triggered.

One might think that all one has to do is catch the widgetDefaultSelected event to know which option the user has selected from the list.
This is only true when the user performs the selection using the keyboard, i.e. when the Enter/Return key is pressed after browsing through the options list.
If the user decides to use the mouse, the widgetDefaultSelected event is not triggered at all, but widgetSelected is.

One might think one could detect the user selection with a mouse listener. But it turns out there’s no (easy) way to detect whether the user has performed a selection using the mouse…

Here’s a workaround for it. Since widgetSelected is triggered by mouse clicks, one may think of trying to use that event. Unfortunately the event carries no really useful information, such as whether it was triggered by a mouse click. Fortunately, the CCombo object does have a couple of methods that allow one to infer that a selection has been made. In particular, it has a method to check whether the options list is visible or not. Since a selection with the mouse makes the options list disappear, one can use it.

Here’s a code snippet that does it:

// Selection events
combo.addSelectionListener(new SelectionAdapter() {
    // Selection changed, check if the options are still visible
    @Override
    public void widgetSelected(SelectionEvent e) {
        // If the list is not visible, assume that the user has
        // performed the selection with the mouse
        if (!combo.getListVisible()) {
            setEditionValue(combo);
        }
    }

    // Option selected
    @Override
    public void widgetDefaultSelected(SelectionEvent e) {
        // User performed a selection with the keyboard
        setEditionValue(combo);
    }
});

This is not a perfect solution. It’s a workaround, but a very helpful one that works.

Debug in R

Development in R is usually performed using editors such as JGR or Tinn-R, together with the R console.
Since there are no “state of the art” IDEs for R development, this is usually the best one can have.

So, how does one debug a program in R?
The usual answer is classical: with print.
But the correct answer is: using the browser() function.

The browser() function works as a break point marker.
Just write browser() where a break point is needed and run the program.
When browser() is reached, the program pauses its execution and a new command line prompt is provided, Browse[1]>.
This command line allows one to interact with the variables available in the scope where the program is temporarily paused.
This debug command line accepts the following debug commands:

  • n: next, advances to the next line, allowing step-by-step execution.
  • c: continue, continues the execution until the program ends or a new browser() call is found.
  • where: prints the stack of function calls active at the break point; traceback() prints the call stack of the last uncaught error.
  • Q: quit, stops the program execution.

It does not look as cool as using the mouse to insert a break point on a line, but it works.

RCP Save Workbench Status

In a Rich Client Platform (RCP) application, the workbench status can be easily saved and restored by simply including the snippet below in the ApplicationWorkbenchAdvisor class, which extends WorkbenchAdvisor.

@Override
public void initialize(IWorkbenchConfigurer configurer) {
    super.initialize(configurer);

    // tell eclipse to save workbench state when it quits
    // (things like window size, view layout, etc.)
    configurer.setSaveAndRestore(true);
}

This will automatically save the status of the workbench when the application is closed and restore it when the application is executed again. The settings are saved in the [runtime-{your-product}]/.metadata/.plugins/org.eclipse.ui.workbench/workbench.xml file.

Reorder Columns in R

There is a very easy way to reorder the columns of a data frame, but it seems to be unknown to many R users.

Having a data frame named test like

     a  b  c  d  e
[1,] 1 11 21 31 41
[2,] 2 12 22 32 42
[3,] 3 13 23 33 43
[4,] 4 14 24 34 44
[5,] 5 15 25 35 45

all that is required is to reassign the data frame using a new column order, like

new_column_order <- c("b", "c", "a", "e", "d")

test <- test[,new_column_order] 

And that solves the problem.

This works both with a data frame and a matrix.