Tuesday, March 20, 2012

Installing MongoDB on Fedora 16

Needing to install MongoDB for some quick tests, I found lots of blogs and wikis about how to get started. All of them were inaccurate for Fedora 16: there is no need to add any additional repositories.

Install


sudo yum install mongodb-server mongodb

Start


sudo service mongod start

done!

Optional: some more insight


Configuration

File is at:
/etc/mongodb.conf

Interesting options

# disallow table scans; if a query needs one, I'd rather fix our code:
notablescan = true
# enable HTTP interface
nohttpinterface = false
# Verbose logging output.
verbose = true

Where is the stuff?

dbpath=/var/lib/mongodb
pidfilepath=/var/run/mongodb/mongodb.pid
logpath=/var/log/mongodb/mongodb.log
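
To double-check where a given installation keeps its files, the same keys can be read straight from the configuration file. This is only a minimal sketch, assuming the simple key = value format shown above; the helper name conf_value is made up for illustration, and a temporary sample file stands in for /etc/mongodb.conf so the snippet runs anywhere:

```shell
#!/bin/sh
# Hypothetical helper (not part of MongoDB): print the value of a key
# from a key=value config file. Comment lines are skipped and spaces
# around '=' are tolerated.
conf_value() {
    # $1 = key name, $2 = path to the config file
    awk -F'=' -v key="$1" '
        /^[ \t]*#/ { next }
        {
            k = $1; gsub(/[ \t]/, "", k)
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v; exit
            }
        }' "$2"
}

# Sample file standing in for /etc/mongodb.conf
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
# where is the stuff
dbpath=/var/lib/mongodb
pidfilepath=/var/run/mongodb/mongodb.pid
logpath = /var/log/mongodb/mongodb.log
EOF

for key in dbpath pidfilepath logpath; do
    echo "$key -> $(conf_value "$key" "$sample_conf")"
done
```

On a real system, pass /etc/mongodb.conf as the second argument instead of the sample file.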

Local console: http://localhost:28017/

Have MongoDB start at boot

sudo systemctl enable mongod.service

Thursday, August 26, 2010

Using BoxGrinder meta-appliance to create custom EC2 AMIs

While I'm happy with BoxGrinder, I don't use it on a daily basis and keep forgetting the trivial instructions; so instead of stressing the project maintainers with the same questions, I'll write the instructions here as a reference for myself and as a tutorial for whoever might be interested.

What is BoxGrinder?
It's a Ruby tool which creates virtual machines from simple definition files; these Appliance Definition files are neutral with respect to cloud vendor and virtualization technology, so you define your appliance once and, with different plugins, have it running on different virtualization platforms. So while I'm experimenting with my builds on Amazon's EC2, I know I'll be able to provide the same appliance to VMware systems, VirtualBox, KVM, and even create bootable USB keys.

It currently supports the Fedora, Red Hat and CentOS operating systems, and built images can be run on several targets. It also uploads the built image: in my case, using EC2, it registers the new AMI for me.

For a complete introduction, there's a nice video on the official website.

The following instructions are collected from the official documentation, which you can find here. I only cherry-picked what I need for my own goals, so make sure to look at the full documentation in case you need more options or target different virtualization providers.

Meta-Appliance
A meta-appliance is an appliance containing almost everything we need to build other appliances; so instead of installing all of the needed software locally, you can grab one from EC2's AMI catalogue and use it. I like this option because you'll end up having to upload your AMI to S3 anyway, so if you assemble it directly on Amazon's systems the process is much quicker and spares your bandwidth.
When selecting a meta-appliance, make sure you select the appropriate architecture: as of BoxGrinder 0.5.1 you need a 64bit meta-appliance to build 64bit AMIs, or a 32bit meta-appliance to build 32bit AMIs (this limitation is tracked as BGBUILD-46 and will likely be resolved soon).
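
Once logged in, it's easy to confirm which kind of meta-appliance you are on before starting a long build. A small sketch; the helper name arch_bits is made up for illustration and only covers the x86 machine names I'd expect to see on EC2:

```shell
#!/bin/sh
# Hypothetical helper: map a machine name, as printed by `uname -m`,
# to the word size of the AMIs this host can build.
arch_bits() {
    case "$1" in
        x86_64|amd64)        echo 64 ;;
        i386|i486|i586|i686) echo 32 ;;
        *)                   echo unknown ;;
    esac
}

machine=$(uname -m)
bits=$(arch_bits "$machine")
echo "machine: $machine, word size: $bits"
```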

There we go:
start an instance of ami-96db30ff and connect with your private key using SSH.

Then update the system, and install some tools:

yum update
rpm -Uvh http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.noarch.rpm
yum install createrepo subversion
gem install boxgrinder-build boxgrinder-build-ec2-platform-plugin boxgrinder-build-s3-delivery-plugin boxgrinder-build-fedora-os-plugin

ec2-ami-tools is required to bundle the AMIs on S3, createrepo is useful to integrate RPMs you built yourself, and finally I use subversion to manage the appliance definition files.

Now create some directories and the local RPM repositories:

mkdir -p /opt/repo/RPMS/i386 /opt/repo/RPMS/noarch /opt/repo/RPMS/x86_64 ~/.boxgrinder/plugins
createrepo /opt/repo/RPMS/i386 && createrepo /opt/repo/RPMS/x86_64 && createrepo /opt/repo/RPMS/noarch
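
If you repeat this setup often, the two commands above can be folded into one loop. A minimal sketch: the function name make_local_repos is made up for illustration, and the createrepo step is skipped with a warning when the tool is not installed, so the snippet can be tried on a scratch directory first:

```shell
#!/bin/sh
# Hypothetical helper: create one RPM repository per architecture
# under the given root directory.
make_local_repos() {
    repo_root=$1
    for arch in i386 noarch x86_64; do
        dir="$repo_root/RPMS/$arch"
        mkdir -p "$dir" || return 1
        if command -v createrepo >/dev/null 2>&1; then
            # Generate the repository metadata, as in the commands above
            createrepo "$dir" || return 1
        else
            echo "createrepo not found, skipping metadata for $dir" >&2
        fi
    done
}

# Demonstrated on a scratch directory; on the meta-appliance the root
# would be /opt/repo.
scratch=$(mktemp -d)
make_local_repos "$scratch"
ls "$scratch/RPMS"
```

The ~/.boxgrinder/plugins directory is not covered by this helper and still needs its own mkdir -p.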

Enter S3 credentials in the plugin configuration to be able to upload your AMI:


vi ~/.boxgrinder/plugins/s3

The configuration looks like this, with the secrets omitted:

access_key: xx
secret_access_key: xxx
bucket: scarlet-private-amis
account_number: 3441-4397-4060
cert_file: /root/cert.pem
key_file: /root/soseaws.pem

You have to upload the cert.pem and soseaws.pem files to the instance: these are your certificate and private key for accessing AWS services.
Now get the appliance definitions:

svn co https://dev.sourcesense.com/repos/dev/scarlet/trunk/appliance-build/appliances appliances --no-auth-cache --username s.grinovero

And create an AMI:

boxgrinder-build appliances/common-scarlet.appl -p ec2 -d ami

This will end as:

I, [2010-08-26T09:14:52.070877 #18565] INFO -- : Bundling AMI...
I, [2010-08-26T09:16:44.546519 #18565] INFO -- : Bundling AMI finished.
I, [2010-08-26T09:16:44.547023 #18565] INFO -- : Uploading common-scarlet AMI to bucket 'scarlet-private-amis'...
I, [2010-08-26T09:17:43.906981 #18565] INFO -- : Image successfully registered under id: ami-723ed41b.

Done! Now take note of the registered AMI id and use it to start as many copies as you need.
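
If you script the build, the AMI id can be pulled out of the log instead of copied by hand. A minimal sketch, assuming the log line format shown above stays the same; the function name registered_ami_id is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read BoxGrinder log output on stdin and print
# the last registered AMI id found in it.
registered_ami_id() {
    sed -n 's/.*registered under id: \(ami-[0-9a-f][0-9a-f]*\).*/\1/p' | tail -n 1
}

# Sample line copied from the build output above.
sample_log="I, [2010-08-26T09:17:43.906981 #18565] INFO -- : Image successfully registered under id: ami-723ed41b."

ami_id=$(printf '%s\n' "$sample_log" | registered_ami_id)
echo "$ami_id"
```

To use it on a real build you would need to capture the output of boxgrinder-build in a file and feed that file to registered_ami_id; I'm assuming here that the log lines can be captured that way.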

Appliance Definition files
Again, to see all the options read the Stormgrind documentation; in short, an appliance definition looks like this:

Example definition

name: common-scarlet
summary: Base AMI definition common to all scarlet AMIs
version: 2
release: 1
os:
  name: fedora
  version: "13"
  password: xxx
hardware:
  cpus: 2
  memory: 512
  partitions:
    /:
      size: 3
packages:
  includes:
    - bash
    - yum
    - vim-minimal
    - openssh-server
    - chkconfig
    - acpid
    - dhclient
    - openssh-clients
    - mc
    - subversion
    - mutt
    - rsync
    - s3cmd
repos:
  - name: "local-noarch"
    baseurl: "file:///opt/repo/RPMS/noarch"
    ephemeral: true
  - name: "local-#ARCH#"
    baseurl: "file:///opt/repo/RPMS/#ARCH#"
    ephemeral: true

Note the pointers to the local RPM repository: there you can add RPM packages which you want to be included in the built appliance, in my case my own application, which is not available in the default Fedora repositories.

Hudson Definition
To have a nice Fedora 13 with all the latest updates running the latest Hudson, just have BoxGrinder grind this definition:

name: hudson-scarlet
summary: Hudson instance to run builds of Scarlet
version: 2
release: 1
os:
  name: fedora
  version: 13
  password: yyy
hardware:
  cpus: 2
  memory: 512
  partitions:
    /:
      size: 3
    /var/lib/hudson/:
      size: 50
appliances:
  - common-scarlet
packages:
  includes:
    - hudson
    - java-1.6.0-openjdk
repos:
  - name: "hudson-repo"
    baseurl: "http://hudson-ci.org/redhat/"
post:
  base:
    - "/sbin/chkconfig --level 345 hudson on"

Note the inheritance from the appliance of the previous example: this appliance will also contain the parent's packages; in addition, an extra repository is enabled to download Hudson, and a large partition is dedicated to it.
Finally, a post-build script is run which makes sure Hudson is started at boot, so you won't even need to log in to the machine to configure it.

Tuesday, December 8, 2009

Blogging now also at in.relation.to

From today I'm also writing on the very cool in.relation.to blog.
This is the collective blog of experts from the Seam and Hibernate teams - I've been reading it for years - and I'm honored they invited me to write about my contribution to Hibernate Search.
It's amazing how these very esteemed developers welcome contributions and are open to any kind of discussion.

Recently I saw some very sad statements about the JBoss community not being truly open, or not being meritocratic; I think that people who believe that either never really tried to contribute, or met the wrong person at the wrong moment: like all communities, it is big and made of humans.
When I first wanted to contribute I wasn't an expert at all; still, I was welcomed for my interest in the project and I always got - and still get - polite answers even to my silliest questions and doubts. After the traditional couple of patches were accepted, I slowly began to feel part of a team. I might have been lucky, but the luck has lasted, as every single person I keep meeting in these groups is at the same time very kind, smart and helpful. Just keep in mind they're all very busy: an answer could take some time.

So today I wrote this post about Hibernate Search's new MassIndexer: read it, comment about it, make use of it! Then ask for improvements and join the fun :-)

Tuesday, September 8, 2009

What is Hibernate Search?


I get this question relatively often, so I suspect the information available online either makes too many assumptions or is too hands-on; I'll try to fill the gap with a very basic introduction.
Hibernate Search is an open source Java project which integrates Hibernate with Lucene; both libraries have proven themselves extremely useful, are stable and are in widespread use; in practice many projects face the need to use both.
Unfortunately the String world of Lucene is quite different from the Hibernate world, and every project trying to integrate the two is doomed to face the same problems, to rewrite more or less the same glue, and to have more code to maintain because of its own bugs or because of API changes in one or both frameworks.

Lucene
Lucene is an Apache library which provides full-text capabilities: you create an index (in memory, on the filesystem, in a database, ...) and then you can search this index with keywords, phrases, boolean queries, etc.
The results are commonly returned by relevance, so the best matching documents are returned first (think of it as a web search engine like Google); the main point is that you have full control over how your items are parsed before entering the string world of the index, so you can choose which information is important for your business and how you define the matching rules. It is very fast and generally considered stable, yet new features are constantly added.
Because Lucene is extremely flexible, working directly with it is like programming "low level", so applications often introduce a separation layer to standardize the way it is used across an application, thus hiding some of the flexibility and possibly introducing some helpers.

Hibernate
The aim of this very successful open source project is to simplify the interaction between the application and the database; technically it's an Object-Relational Mapping service, and you'll find plenty of information and tutorials about it on the web. The important point for introducing Hibernate Search is that Hibernate makes you use POJOs to define the domain model of your application, annotating them to define the mapping to the database, and provides good APIs and even an object-oriented query language to interact with the database, all nicely fitted in a transactional world.

Hibernate Search
Hibernate Search is built on top of Lucene, like Hibernate is built on top of your SQL database. As Hibernate maps POJOs to tables, Hibernate Search maps them to Lucene's index by introducing a new set of annotations. The interesting point here is that you annotate the same entities with both families of annotations, and whether you make a Hibernate query to the database or a Lucene query to the index, you'll get Hibernate managed entities in both cases. You define your domain model - which is unique - and how it maps to the database and to the index. When you make changes to your data the service will update both database and index at transaction commit.
The API to run and paginate queries is an extension of Hibernate's (and JPA's) API, so the changes needed in an application to introduce full-text capabilities are minimal.
When using Lucene the code usually gets quite verbose, for example when defining Analyzers or Filters; with Hibernate Search you can define these declaratively and reuse them by name. Last but not least, it makes use of several performance-improving tricks, such as sharing file buffers across concurrent reading sessions, caching filter results, batching index changes, and clustering solutions. These are all nice capabilities which you don't need to know about, but they are there in case you need them.

Flexibility
Even though it is a simplifying layer between the application and Lucene, it doesn't hide any advanced feature; instead it provides tools to make use of them. Developers can customize all aspects, from defining custom bridges for their own types up to replacing or extending whole parts of the framework. Each major component can be replaced with custom code: define your own index storage strategy by creating a custom DirectoryProvider, use your own LockManager, create a new IndexShardingStrategy, fine-tune all the performance settings which Lucene exposes. If you're still missing something, you're free to change the code and submit patches.

Websites:
Hibernate Search - website
Hibernate Search - forums
Lucene's Java implementation website

Books:
Hibernate Search in Action
Java Persistence with Hibernate
Lucene in Action, Second Edition