Thursday, March 31, 2011

Use Spring options tag to display enum's toString value

I'm using Spring for an HTML form. One of the fields is an enum, and thus I'd like an HTML drop-down list (<option> tag). My enum's name is different from the toString() value. For example:

public enum Size {
    SMALL("Small"), LARGE("Large"), VERY_LARGE("Very large");

    private final String displayName;

    private Size(String displayName) {
        this.displayName = displayName;
    }

    public String toString() {
        return displayName;
    }
}

I want the user to see the toString() value. Normally this is accomplished using the itemLabel of the Spring options tag.

<form:options items="${enumValues}" itemLabel="beanProperty" />

But toString() isn't a bean property as it doesn't start with "get". Is there a way to set itemLabel to use toString's value without having to create a getter?

From stackoverflow
  • Why not add a public getDisplayName() method to your enum?

    Steve Kuo : I'm trying to avoid adding this extra method to each enum. It doesn't help that enum can't be extended.
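
    For reference, a minimal sketch of the getter approach suggested above; the getDisplayName() accessor and the adjusted itemLabel are illustrative additions, not part of the original question:

    public enum Size {
        SMALL("Small"), LARGE("Large"), VERY_LARGE("Very large");

        private final String displayName;

        private Size(String displayName) {
            this.displayName = displayName;
        }

        // Exposed as a bean property so itemLabel="displayName" can resolve it
        public String getDisplayName() {
            return displayName;
        }

        @Override
        public String toString() {
            return displayName;
        }
    }

    <form:options items="${enumValues}" itemLabel="displayName" />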

Ways to Search for a Pattern in all Stored Procedures and then Open it to be Altered

How can you search all the Stored Procedures for a Pattern and then open the stored procedures to be edited?

Is there anything built inside of SQL Server 2005?

Or are there any 3rd party addins that will do this searching?

I am also using Red Gate's SQL Prompt but I have not noticed that option.

Currently I am using the following command to do the searching

SELECT ROUTINE_SCHEMA, ROUTINE_NAME, ROUTINE_DEFINITION 
    FROM INFORMATION_SCHEMA.ROUTINES 
    WHERE ROUTINE_DEFINITION LIKE '%tblVacationAllocationItem%' 
    AND ROUTINE_TYPE='PROCEDURE'
    ORDER BY ROUTINE_SCHEMA, ROUTINE_NAME

This works pretty well, but it returns the content of the stored procedure in one of the columns, which is hard to read. So I have to use the Object Explorer to find and open the stored procedure to see the full content.

Edited: SQL Dependency Tracker allows you to dynamically explore all your database object dependencies, using a range of graphical layouts. This looks like it would answer a few of the questions when searching for a pattern. Any other software similar to SQL Dependency Tracker?

From stackoverflow
  • Unfortunately, there is no feature in SQL Server 2005, other than querying, to find stored procedures based on their contents. The only thing you can do is filter by name in the Object Explorer.

    I use Visual Studio Database Edition to accomplish this task.

  • There is an open source stored procedure called sp_grep that allows you to find database objects based on the DDL/code of their makeup. I use this procedure all the time to find objects that meet certain criteria. This is very useful in database refactoring.

    To programmatically open and modify SQL objects you can use the SQLDMO object in any .NET application. Here are some examples of using SQLDMO.

    Example: exec sp_grep 'colA='

    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER OFF
    GO
    
    /*********************************************************************
    * Stored procedure  sp_grep 
    * SQL Server:   Microsoft SQL Server 6.0, 4.21 for Windows NT, 
    *               Microsoft SQL Server 4.2 for OS/2.
    * Author:       Andrew Zanevsky, AZ Databases, Inc.
    * Version/Date: Version 1.1,  October 26, 1995
    * Description:  Searches syscomments table in the current database
    *               for occurrences of a combination of strings. 
    *               Correctly handles cases when a substring begins in 
    *               one row of syscomments and continues in the next. 
    * Parameters: - @parameter describes the search:
    *               string1 {operation1 string2} {operation2 string 3} ...
    *               where - stringN is a string of characters enclosed in
    *                       curly brackets not longer than 80 characters. 
    *                       Brackets may be omitted if stringN does not 
    *                       contain spaces or characters: +,-,&;
    *                     - operationN is one of the characters: +,-,&.
    *               Parameter is interpreted as follows:
    *               1.Compose the list of all objects where string1 occurs.
    *               2.If there is no more operations in the parameter,
    *                 then display the list and stop. Otherwise continue.
    *               3.If the next operation is + then add to the list all 
    *                   objects where the next string occurs;
    *                 else if the next operation is - then delete from the 
    *                   list all objects where the next string occurs;
    *                 else if the next operation is & then delete from the 
    *                   list all objects where the next string does not 
    *                   occur (leave in the list only those objects where 
    *                   the next string occurs);
    *               4.Goto step 2.
    *               Parameter may be up to 255 characters long, and may not 
    *               contain <CarriageReturn> or <LineFeed> characters.
    *               Please note that operations are applied in the order
    *               they are used in the parameter string (left to right). 
    *               There is no other priority of executing them. Every 
    *               operation is applied to the list combined as a result 
    *               of all previous operations.
    *               Number of spaces between words of a string matters in a
    *               search (e.g. "select *" is not equal to "select  *").
    *               Short or frequently used strings (such as "select") may 
    *               produce a long result set.
    *
    *             - @case: i = insensitive / s = sensitive (default)
    *               Insensitive search is performed regardless of this parameter 
    *               if SQL Server is set up with case insensitive sort order.
    *
    * Examples:     sp_grep employee 
    *                 list all objects where string 'employee' occurs;
    *               sp_grep employee, i
    *                 list all objects where string 'employee' occurs in 
    *                 any case (upper, lower, or mixed), such as 
    *                 'EMPLOYEE', 'Employee', 'employee', etc.;
    *               sp_grep 'employee&salary+department-trigger'
    *                 list all objects where either both strings 'employee'
    *                 and 'salary' occur or string 'department' occurs, and 
    *                 string 'trigger' does not occur;
    *               sp_grep '{select FirstName + LastName}'
    *                 list all objects where string 
    *                 "select FirstName + LastName" occurs;
    *               sp_grep '{create table}-{drop table}'
    *                 list all objects where tables are created and not 
    *                 dropped.
    *                 
    **********************************************************************/
    
    -- sp_grep   v1.0 03/16/1995, v1.1 10/26/1995
    -- Author:   Andrew Zanevsky, AZ Databases, Inc. 
    -- E-mail:   zanevsky@azdatabases.com
    ALTER proc [dbo].[sp_grep] @parameter varchar(255) = null, @case char(1) = 's'
    as
    
    declare @str_no          tinyint, 
            @msg_str_no      varchar(3),
            @operation       char(1), 
            @string          varchar(80), 
            @oper_pos        smallint,
            @context         varchar(255),
            @i               tinyint,
            @longest         tinyint,
            @msg             varchar(255)
    
    if @parameter is null /* provide instructions */
    begin
        print 'Execute sp_grep "{string1}operation1{string2}operation2{string3}...", [case]'
        print '- stringN is a string of characters up to 80 characters long, '
        print '  enclosed in curly brackets. Brackets may be omitted if stringN '
        print '  does not contain leading and trailing spaces or characters: +,-,&.'
        print '- operationN is one of the characters: +,-,&. Interpreted as or,minus,and.'
        print '  Operations are executed from left to right with no priorities.'
        print '- case: specify "i" for case insensitive comparison.'
        print 'E.g. sp_grep "alpha+{beta gamma}-{delta}&{+++}"'
        print '     will search for all objects that have an occurence of string "alpha"'
        print '     or string "beta gamma", do not have string "delta", '
        print '     and have string "+++".'
        return
    end
    
    /* Check for <CarriageReturn> or <LineFeed> characters */
    if charindex( char(10), @parameter ) > 0 or charindex( char(13), @parameter ) > 0
    begin
        print 'Parameter string may not contain <CarriageReturn> or <LineFeed> characters.'
        return
    end
    
    if lower( @case ) = 'i'
            select  @parameter = lower( ltrim( rtrim( @parameter ) ) )
    else
            select  @parameter = ltrim( rtrim( @parameter ) )
    
    create table #search ( str_no tinyint, operation char(1), string varchar(80), last_obj int )
    create table #found_objects ( id int, str_no tinyint )
    create table #result ( id int )
    
    /* Parse the parameter string */
    select @str_no = 0
    while datalength( @parameter ) > 0
    begin
      /* Get operation */
      select @str_no = @str_no + 1, @msg_str_no = rtrim( convert( char(3), @str_no + 1 ) )
      if @str_no = 1
        select  @operation = '+'
      else 
      begin
        if substring( @parameter, 1, 1 ) in ( '+', '-', '&' )
            select  @operation = substring( @parameter, 1, 1 ),
                    @parameter = ltrim( right( @parameter, datalength( @parameter ) - 1 ) )
        else
        begin
            select @context = rtrim( substring( 
                            @parameter + space( 255 - datalength( @parameter) ), 1, 20 ) )
            select @msg = 'Incorrect or missing operation sign before "' + @context + '".'
            print  @msg 
            select @msg = 'Search string ' + @msg_str_no + '.'
            print  @msg 
            return
        end
      end
    
      /* Get string */
      if datalength( @parameter ) = 0
      begin
          print 'Missing search string at the end of the parameter.'
          select @msg = 'Search string ' + @msg_str_no + '.'
          print  @msg 
          return
      end
      if substring( @parameter, 1, 1 ) = '{'
      begin
          if charindex( '}', @parameter ) = 0
          begin
              select @context = rtrim( substring( 
                          @parameter + space( 255 - datalength( @parameter) ), 1, 200 ) )
              select @msg = 'Bracket not closed after "' + @context + '".'
              print  @msg 
              select @msg = 'Search string ' + @msg_str_no + '.'
              print  @msg 
              return
          end
          if charindex( '}', @parameter ) > 82
          begin
              select @context = rtrim( substring( 
                          @parameter + space( 255 - datalength( @parameter) ), 2, 20 ) )
              select @msg = 'Search string ' + @msg_str_no + ' is longer than 80 characters.'
              print  @msg 
              select @msg = 'String begins with "' + @context + '".'
              print  @msg 
              return
          end        
          select  @string    = substring( @parameter, 2, charindex( '}', @parameter ) - 2 ),
                  @parameter = ltrim( right( @parameter, 
                                  datalength( @parameter ) - charindex( '}', @parameter ) ) )
      end
      else
      begin
          /* Find the first operation sign */
          select @oper_pos = datalength( @parameter ) + 1
          if charindex( '+', @parameter ) between 1 and @oper_pos
              select @oper_pos = charindex( '+', @parameter )
          if charindex( '-', @parameter ) between 1 and @oper_pos
              select @oper_pos = charindex( '-', @parameter )
          if charindex( '&', @parameter ) between 1 and @oper_pos
              select @oper_pos = charindex( '&', @parameter )
    
          if @oper_pos = 1
          begin
              select @context = rtrim( substring( 
                          @parameter + space( 255 - datalength( @parameter) ), 1, 20 ) )
              select @msg = 'Search string ' + @msg_str_no + 
                            ' is missing, before "' + @context + '".'
              print  @msg 
              return
          end        
          if @oper_pos > 81
          begin
              select @context = rtrim( substring( 
                          @parameter + space( 255 - datalength( @parameter) ), 1, 20 ) )
              select @msg = 'Search string ' + @msg_str_no + ' is longer than 80 characters.'
              print  @msg 
              select @msg = 'String begins with "' + @context + '".'
              print  @msg 
              return
          end        
    
          select  @string    = substring( @parameter, 1, @oper_pos - 1 ),
                  @parameter = ltrim( right( @parameter, 
                                  datalength( @parameter ) - @oper_pos + 1 ) )
      end
      insert #search values ( @str_no, @operation, @string, 0 )
    
    end
    select @longest = max( datalength( string ) ) - 1
    from   #search
    /* ------------------------------------------------------------------ */
    /* Search for strings */
    if @case = 'i'
    begin
        insert #found_objects
        select a.id, c.str_no
        from   syscomments a, #search c
        where  charindex( c.string, lower( a.text ) ) > 0
    
        insert #found_objects
        select a.id, c.str_no
        from   syscomments a, syscomments b, #search c
        where  a.id        = b.id
        and    a.number    = b.number
        and    a.colid + 1 = b.colid
        and    charindex( c.string, 
                    lower( right( a.text, @longest ) + 
    /*                     space( 255 - datalength( a.text ) ) +*/
                           substring( b.text, 1, @longest ) ) ) > 0
    end
    else
    begin
        insert #found_objects
        select a.id, c.str_no
        from   syscomments a, #search c
        where  charindex( c.string, a.text ) > 0
    
        insert #found_objects
        select a.id, c.str_no
        from   syscomments a, syscomments b, #search c
        where  a.id        = b.id
        and    a.number    = b.number
        and    a.colid + 1 = b.colid
        and    charindex( c.string, 
                    right( a.text, @longest ) + 
    /*              space( 255 - datalength( a.text ) ) +*/
                    substring( b.text, 1, @longest ) ) > 0
    end
    /* ------------------------------------------------------------------ */
    select distinct str_no, id into #dist_objects from #found_objects
    create unique clustered index obj on #dist_objects  ( str_no, id )
    
    /* Apply one operation at a time */
    select @i = 0
    while @i < @str_no
    begin
        select @i = @i + 1
        select @operation = operation from #search where str_no = @i
    
        if @operation = '+'
            insert #result
            select id
            from   #dist_objects 
            where  str_no = @i
        else if @operation = '-'
            delete #result
            from   #result a, #dist_objects b
            where  b.str_no = @i
            and    a.id = b.id
        else if @operation = '&'
            delete #result
            where  not exists 
                    ( select 1
                      from   #dist_objects b
                      where  b.str_no = @i
                      and    b.id = #result.id )
    end
    
    /* Select results */
    select distinct id into #dist_result from #result
    
    /* The following select has been borrowed from the sp_help 
    ** system stored procedure, and modified. */
    select  Name        = o.name,
            /* Remove 'convert(char(15)' in the following line 
            ** if user names on your server are longer. */
            Owner       = convert( char(15), user_name(uid) ),
            Object_type = substring(v.name + x.name, 1, 16)
    from    #dist_result           d,
            sysobjects             o, 
            master.dbo.spt_values  v,
            master.dbo.spt_values  x
    where   d.id = o.id
    /* SQL Server version 6.x uses 15, prior versions use 7 in expression below */
    and     o.sysstat & ( 7 + 8 * sign( charindex( '6.', @@version ) ) ) = v.number
    and     v.type = "O"
    and     x.type = "R"
    and     o.userstat & -32768 = x.number
    order by Object_type desc, Name asc
    
    Rune Grimstad : And it works! :-) Great stuff
  • To query the definition of an object, one could use syscomments. For example:

    select * from syscomments where text like '%tblVacationAllocationItem%'
    

    While this will work for most scenarios, if the definition is longer than 4000 characters, there will exist multiple syscomment rows for a single object. Although unlikely, it is possible that your search phrase spans multiple syscomment rows.

  • Not the answer to your question, but we save all our SProcs as separate files - easier to globally make changes using a Programmer's Editor, and they are easy to get into a version management repository (SVN in our case).

  • I have posted the code in the following article for a Stored Proc that works in SQL 2000 and above that does a comprehensive search (across Procs, Functions, Views, Defaults, Jobs, etc.) and can optionally ignore comments and optionally ignore string literals. You can find it at SQLServerCentral.com :

    http://www.sqlservercentral.com/articles/Stored+Procedure/62975/

    It does not automatically open anything for editing but does give the line number of where it matches the text and takes into account the 4000-character chunks as mentioned by Cadaeic. I was planning on updating the article soon to include a loop across all databases and even to include some of my SQL# RegEx functions (for Object Name matching as well as the search string).

    Gerhard Weiss : Thanks Solomon. I actually use SQL#. (Very nice product!) Any enhancements with using RegEx would be most appreciated.
    Gerhard Weiss : I am getting a 'String or binary data would be truncated.' I did modify the .sql to be run inline instead of as a stored procedure, so I am going to look into it to see if my changes affected this. I am using SQL Server 2005.
  • begin
    --select column_name from INFORMATION_SCHEMA.COLUMNS where TABLE_NAME='Products' 
    --Declare the Table variable 
    DECLARE @GeneratedStoredProcedures TABLE
    (
      Number INT IDENTITY(1,1), --Auto incrementing Identity column
      name VARCHAR(300) --The string value
    )
    
    --Declare a variable to remember the position of the current delimiter
    DECLARE @CurrentDelimiterPositionVar INT 
    declare @sqlCode varchar(max)
    --Declare a variable to remember the number of rows in the table
    DECLARE @Count INT
    
    --Populate the TABLE variable using some logic
    INSERT INTO @GeneratedStoredProcedures SELECT name FROM sys.procedures where name like 'procGen_%'
    
    --Initialize the looper variable
    SET @CurrentDelimiterPositionVar = 1
    
    --Determine the number of rows in the Table
    SELECT @Count=max(Number) from @GeneratedStoredProcedures
    
    --A variable to hold the currently selected value from the table
    DECLARE @CurrentValue varchar(300);
    
    --Loop through until all row processing is done
    WHILE @CurrentDelimiterPositionVar <= @Count
    BEGIN
     --Load current value from the Table
     SELECT @CurrentValue = name FROM @GeneratedStoredProcedures WHERE Number = @CurrentDelimiterPositionVar 
     --Process the current value
     --print @CurrentValue
     set @sqlCode = 'drop procedure ' + @CurrentValue
     print @sqlCode
     --exec (@sqlCode)
    
    
     --Increment loop counter
     SET @CurrentDelimiterPositionVar = @CurrentDelimiterPositionVar + 1;
    END
    
    end
    

Sorting based on calculation with nhibernate - best practice

I need to do paging with the sort order based on a calculation. The calculation is similar to something like reddit's hotness algorithm, in that it's dependent on time - the time since post creation.

I'm wondering what the best practice for this would be: whether to implement the sort as a SQL function, or to run an update once an hour that recalculates the whole table.

The table has hundreds of thousands of rows. And I'm using NHibernate, so this could cause problems for the scheduled full calculation.

Any advice?

From stackoverflow
  • If you can perform the calculation using SQL, try using Hibernate to load the sorted collection by executing a SQLQuery where your query includes an 'ORDER BY' expression.
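
    As a rough sketch of that idea with NHibernate's CreateSQLQuery (the Post entity, table name, and the hotness expression below are placeholders for whatever your schema actually uses):

    // Page of posts ordered by a time-decayed score computed in SQL (expression is illustrative)
    IList<Post> page = session
        .CreateSQLQuery(@"SELECT * FROM Posts
                          ORDER BY Score / POWER(DATEDIFF(hour, CreatedAt, GETDATE()) + 2, 1.5) DESC")
        .AddEntity(typeof(Post))
        .SetFirstResult(pageIndex * pageSize)   // paging handled by the database
        .SetMaxResults(pageSize)
        .List<Post>();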

  • It most likely will depend a lot on the load on your server. A few assumptions for my answer:

    1. Your calculation is most likely not simple, but will take into account a variety of factors, including time elapsed since post

    2. You are expecting at least reasonable growth in your site, meaning new data will be added to your table.

    I would suggest your best bet would be to calculate and store your ranking value, and as Nuno G mentioned retrieve using an ordered clause. As you note there are likely to be some implications, two of which would be:

    1. Scheduling Updates
    2. Ensuring access to the table

    As far as scheduling goes, you may be able to look at some ways of intelligently recalculating your value. For example, you may be able to identify when a calculation is likely to be altered (for example, if a dependent record is updated you might fire a trigger, adding the ID of your table to a queue for recalculation). You may also do the update in ranges, rather than on the full table.

    You will also want to minimise any locking of your table whilst you are recalculating. There are a number of ways to do this, including setting your isolation levels (using MS SQL terminology). If you are really worried you could even perform your calculation externally (e.g. in a temp table) and then simply run an update of the values to your main table.

    As a final note, I would recommend looking into the paging options available to you - if you are talking about thousands of records, make sure that your mechanism determines the page you need on the SQL Server, so that you are not returning thousands of rows to your application, as this will slow things down for you.

Mapping a global variable from a shared library with ctypes

I'd like to map an int value pbs_errno declared as a global in the library libtorque.so using ctypes.

Currently I can load the library like so:

from ctypes import *
libtorque = CDLL("libtorque.so")

and have successfully mapped a bunch of the functions. However, for error checking purposes many of them set the pbs_errno variable, so I need access to that as well. But if I try to access it I get:

>>> pytorque.libtorque.pbs_errno
<_FuncPtr object at 0x9fc690>

Of course, it's not a function pointer and attempting to call it results in a seg fault.

It's declared as int pbs_errno; in the main header and extern int pbs_errno; in the API header files.

Objdump shows the symbol as:

00000000001294f8 g    DO .bss   0000000000000004  Base        pbs_errno
From stackoverflow
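
A minimal sketch of how a global int like this is typically bound with ctypes, using in_dll rather than plain attribute access (which only yields a function-pointer wrapper); the library and symbol names are taken from the question:

from ctypes import CDLL, c_int

libtorque = CDLL("libtorque.so")

# Bind the global by symbol name; .value reflects the library's current value
pbs_errno = c_int.in_dll(libtorque, "pbs_errno")
print(pbs_errno.value)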

LINQ to Entities how to return all records from the Parent table?

I am unable to select all the records from the parent table using Linq to Entities.

This is a simple DB design (image below):

Image Link
Table relation

This is the exact output I want using Linq to Entities or Linq to SQL (image below):

Image Link
Sql Results

When I use Linq to Entities or Linq To Sql I can only get the records from the child table that has a foreign key relation. I am not able to get the null values as shown above.

I want to have the null values to show just like when you use ‘left outer join’.

Thanks for any help.

From stackoverflow
  • from entity in MyContext.EntityType.Include("ChildEntitiesNavigationPropertyName")
    select entity;
    

    This returns all instances of EntityType, plus ChildEntitiesNavigationPropertyName when/if it exists. For tabular form use an anonymous type:

    from entity in MyContext.EntityType.Include("ChildEntitiesNavigationPropertyName")
    select new {ParentProperty = entity.ParentProperty, 
                ChildProperty  = entity.ChildEntitiesNavigationPropertyName.ChildProperty};
    

    For a 1..* property:

    from entity in MyContext.EntityType.Include("ChildEntitiesNavigationPropertyName")
    from child in entity.ChildEntitiesNavigationPropertyName.DefaultIfEmpty()
    select new {ParentProperty = entity.ParentProperty, 
                ChildProperty  = child.ChildProperty};
    
    EZ : When I do the example it just says childproperty is not defined. Although it works with 'from child in entity.ChildEntitiesNavigationPropertyName', which does not give me the Null records I want.
    Craig Stuntz : You must be using a 1..* property then. In that case, use group by to join. I'll update the example.
    EZ : By the way are you seeing my images in the question? The images show up when I edit the question but cannot see them when view the question.
    Craig Stuntz : No. I'm guessing, because I can't see them.
    EZ : Ok put links to the images, not sure why the images are not showing. Thanks for all your help.
    EZ : Craig created another question more specific. http://stackoverflow.com/questions/544378/linq-to-entities-how-to-return-all-records-from-parent
  • I'm pretty sure you can select from employees and then do a left join in LINQ, something like this (I don't have VS on this machine):

    var results = from e in dbContext.Employees
                  join s in dbContext.Sales on e.EmployeeID equals s.EmployeeID
                  select new { e, s };
    

    You may want to select just the columns you want. Hope it gives you the results you want.

    achinda99 : If there is no answer by the time I get into work tomorrow, I'll work on it. But instead of trying to work from the Sales up to the Employees, you need to join Employees with Sales.
    EZ : The lead table is Employee, just can't believe that a simple query is so difficult. Thanks for your help.
    achinda99 : I completely forgot! Let me have a look and see now.

Problem With Regular Expression to Remove HTML Tags

In my Ruby app, I've used the following method and regular expression to remove all HTML tags from a string:

str.gsub(/<\/?[^>]*>/,"")

This regular expression did just about all I was expecting it to, except it caused all quotation marks to be transformed into &#8220; and all single quotes to be changed to &#8221; .

What's the obvious thing I'm missing to convert the messy codes back into their proper characters?

Edit: The problem occurs with or without the Regular Expression, so it's clear my problem has nothing to do with it. My question now is how to deal with this formatting error and correct it. Thanks!

From stackoverflow
  • You could use a multi-pass system to get the results you are looking for.

    After running your regular expression, run an expression to convert &#8220; to quotes and another to convert &#8221; to single quotes.
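
    A minimal Ruby sketch of that multi-pass idea (only the two entities mentioned are handled here; a real pass would cover more):

    clean = str.gsub(/<\/?[^>]*>/, "")    # first pass: strip tags
    clean = clean.gsub("&#8220;", '"')    # second pass: double-quote entity
    clean = clean.gsub("&#8221;", "'")    # third pass: single-quote entity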

  • You're going to run into more trouble when you see something like:

    <doohickey name="<foobar>">
    

    You'll want to apply something like:

    gsub(/<[^<>]*>/, "")
    

    ...for as long as the pattern matches.

    R.. : Unless you meant that the OP should be prepared to deal with bogus/invalid HTML, you're wrong. This form will never appear in correct HTML.
  • This regular expression did just about all I was expecting it to, except it caused all quotation marks to be transformed into “ and all single quotes to be changed to ” .

    This doesn't sound as if the RegExp would be doing this. Are you sure it's different before?

    See this question here for information about the problem; it has an excellent answer:
    Get non UTF-8 form fields as UTF-8 in php.

    Bryan Woods : Holy cow, you're right. I noticed it after adding the Regex, but the effect on characters happens either way. I just hadn't noticed due to it being less obvious. So my question becomes: How do I fix this formatting?
  • I've run into a similar problem with character changes, this happened when my code ran through another module that enforced UTF-8 encoding and then when it came back, I had a different file (slurped array of lines) on my hands.

  • Use CGI::unescapeHTML after you perform your regular expression substitution:

    CGI::unescapeHTML(str.gsub(/<\/?[^>]*>/,""))
    

    See http://www.ruby-doc.org/core/classes/CGI.html#M000547

    In the above code snippet, gsub removes all HTML tags. Then, unescapeHTML() reverts all HTML entities (such as &lt;, &#8220;) to their actual characters (<, quotes, etc.)

    With respect to another post on this page, note that you will never ever be passed HTML such as

    <tag attribute="<value>">2 + 3 < 6</tag>
    

    (which is invalid HTML); what you may receive is, instead:

    <tag attribute="&lt;value&gt;">2 + 3 &lt; 6</tag>
    

    The call to gsub will transform the above to:

    2 + 3 &lt; 6
    

    And unescapeHTML will finish the job:

    2 + 3 < 6
    

How do I fill a DataSet or a DataTable from a LINQ query resultset?

How do you expose a LINQ query as an ASMX web service? Usually, from the business tier, I can return a typed DataSet or DataTable which can be serialized for transport over ASMX.

How can I do the same for a LINQ query? Is there a way to populate a typed DataSet or DataTable via a LINQ query?:

public static MyDataTable CallMySproc()
{
    string conn = ...;

    MyDatabaseDataContext db = new MyDatabaseDataContext(conn);
    MyDataTable dt = new MyDataTable();

    // execute a sproc via LINQ
    var query = from dr in db.MySproc().AsEnumerable()
                select dr;

    // copy the LINQ query resultset into a DataTable - this does not work!
    dt = query.CopyToDataTable();

    return dt;
}

How can I get the resultset of a LINQ query into a DataSet or DataTable? Alternatively, is the LINQ query serializeable so that I can expose it as an ASMX web service?

From stackoverflow
  • If you use a return type of IEnumerable, you can return your query variable directly.

  • Could you create a class object and return a list(T) of the query?

  • As mentioned in the question, IEnumerable has a CopyToDataTable method:

    IEnumerable<DataRow> query =
        from order in orders.AsEnumerable()
        where order.Field<DateTime>("OrderDate") > new DateTime(2001, 8, 1)
        select order;
    
    // Create a table from the query.
    DataTable boundTable = query.CopyToDataTable<DataRow>();
    

    Why won't that work for you?

    motto : To everyone wondering why CopyToDataTable() doesn't work on their machine: This function is not part of .NET 3.5 SP1 nor will it be of .NET 4.0; it has been restricted to IEnumerable<DataRow> and does not work for IEnumerable<T> -- http://bit.ly/dL0G5
  • Make a set of Data Transfer Objects, a couple of mappers, and return that via the .asmx.
    You should never expose the database objects directly, as a change in the procedure schema will propagate to the web service consumer without you noticing it.
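
    A bare-bones sketch of that DTO-plus-mapper shape (all of the names here are invented for illustration):

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web.Services;

    // Serializable DTO exposed by the .asmx instead of the LINQ-generated types
    public class OrderDto
    {
        public int Id { get; set; }
        public DateTime OrderDate { get; set; }
    }

    public class OrdersService : WebService
    {
        [WebMethod]
        public List<OrderDto> GetRecentOrders()
        {
            using (var db = new MyDatabaseDataContext())
            {
                // Map entities to DTOs so schema changes stay behind the service boundary
                return db.Orders
                         .Select(o => new OrderDto { Id = o.OrderId, OrderDate = o.OrderDate })
                         .ToList();
            }
        }
    }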

  • Why would you want to do this? Datasets are heavyweight objects with an enormous payload when you serialize them (I'd guess this is true also for DataTables). Lars' suggestion is the way to go

  • To perform this query against a DataContext class, you'll need to do the following:

    MyDataContext db = new MyDataContext();
    IEnumerable<DataRow> query = (from order in db.Orders.AsEnumerable()
                                  select new
                                  {
                                     order.Property,
                                     order.Property2
                                  }) as IEnumerable<DataRow>;
    return query.CopyToDataTable<DataRow>();
    

    Without the "as IEnumerable;" you will se the following compilation error:

    Cannot implicitly convert type 'System.Collections.Generic.IEnumerable<AnonymousType#1>' to 'System.Collections.Generic.IEnumerable<System.Data.DataRow>'. An explicit conversion exists (are you missing a cast?)

How to make not-null varchar columns to be required to be not empty in ASP.NET MVC validation?

So I'm using ASP.NET MVC RC1 and using the DefaultModelBinder's validation to show validation errors when not-null integer fields are not set to a value. This is done by default by MVC. However the same is not true for string(varchar) fields, because the binder is passed an empty string from the view, which is completely valid for a not-null varchar column.

In practice, however, when I mark a column as not-null, I pretty much always mean I want it to be non-empty. So I've come up with this way of adding that validation check to the default binder:

public class DefaultEntityBinder : DefaultModelBinder
{
    protected override bool OnPropertyValidating(ControllerContext controllerContext, ModelBindingContext bindingContext, System.ComponentModel.PropertyDescriptor propertyDescriptor, object value)
    {
        bool valid = base.OnPropertyValidating(controllerContext, bindingContext, propertyDescriptor, value);

        ColumnAttribute column = propertyDescriptor.Attributes.OfType<ColumnAttribute>().FirstOrDefault();
        if (column != null &&
            propertyDescriptor.PropertyType == typeof(string) &&
            !column.CanBeNull &&
            value.IsNullOrBlankString())
        {
            valid = false;
            bindingContext.ModelState[propertyDescriptor.Name].Errors.Add("A value is required");
        }
        return valid;
    }
}

I'm wondering if this is the most efficient way of achieving the goal of adding required string field validation to the default validation in MVC.

From stackoverflow
  • I have looked through the default model binding (in particular, TypeHelpers.TypeAllowsNullValue), and I don't see anything better for this. You could implement IDataErrorInfo on the type which contains the string, but you would have to do that on each type you support. So what you're doing is fine. If you're open to third-party code, the xVal validation framework is worth considering. Then you can use things like the attributes in System.ComponentModel to decorate non-nullable strings.
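
    If you go the attribute route, the per-property declaration is roughly this (assuming xVal or a similar binder hook enforces the attribute server-side; the Customer type is just an example):

    using System.ComponentModel.DataAnnotations;

    public class Customer
    {
        // Required rejects both null and empty strings, matching the not-null varchar intent
        [Required(ErrorMessage = "A value is required")]
        public string Name { get; set; }
    }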

MySQL: Finding rows that don't take part in a relationship

I have two tables: 'movies' and 'users'. There's an n:m relationship between them, describing which movies a user has seen. This is described with a table 'seen'. Now I want to find, for a given user, all the movies he has not seen. My current solution is like this:

SELECT *
FROM movies 
WHERE movies.id NOT IN (
     SELECT seen.movie_id 
     FROM seen 
     WHERE seen.user_id=123
)

This works fine but does not seem to scale very well. Is there a better approach to this?

From stackoverflow
  • Not only does your query work, it's the right approach to the problem as stated. Perhaps you can find a different way to approach the problem? A simple LIMIT on your outer select should be very fast even for large tables, for instance.

  • Seen is your join table, so yes, this looks like the correct solution. You are effectively "subtracting" the set of movie IDs in SEEN (for a user) from the totality in MOVIES, resulting in the unseen movies for that user.

    This is called a "negative join", and sadly NOT IN or NOT EXISTS are the best options. (I would love to see a negative join syntax that was similar to INNER/OUTER/LEFT/RIGHT joins, but where the ON clause could be a subtraction statement).

    @Bill's solution without a subquery should work, although as he noted it is a good idea to test your solution for performance both ways. I suspect that subquery or not, the entire SEEN.ID index (and of course the entire MOVIE.ID index) is going to be evaluated both ways: it will depend on how the optimizer handles it from there.

  • If your DBMS supports bitmap indexes, you could try them.

    Bill Karwin : He tagged the question 'mysql'. MySQL does not support bitmap indexes.
    abababa22 : Oops, I didn't look at the tag. :(
  • This works fine but does not seem to scale very well. Is there a better approach to this?

    Did you try EXPLAIN on this query?
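
    For reference, prefixing the statement is all it takes; the output shows whether the subquery can use an index on seen:

    EXPLAIN
    SELECT *
    FROM movies
    WHERE movies.id NOT IN (
        SELECT seen.movie_id
        FROM seen
        WHERE seen.user_id = 123
    );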

  • Here's a typical way to do this query without using the subquery method you showed. This may satisfy @Godeke's request to see a join-based solution.

    SELECT * 
    FROM movies m
     LEFT OUTER JOIN seen s
     ON (m.id = s.movie_id AND s.user_id = 123)
    WHERE s.movie_id IS NULL;
    

    However, in most brands of database this solution can perform worse than the subquery solution. It's best to use EXPLAIN to analyze both queries, to see which one will do better given your schema and data.

    Here's another variation on the subquery solution:

    SELECT * 
    FROM movies m
    WHERE NOT EXISTS (SELECT * FROM seen s 
                      WHERE s.movie_id = m.id 
                        AND s.user_id=123);
    

    This is a correlated subquery, which must be evaluated for every row of the outer query. Usually this is expensive, and your original example query is better. On the other hand, in MySQL "NOT EXISTS" is often better than "column NOT IN (...)"

    Again, you must test each solution and compare the results to be sure. It's a waste of time to choose any solution without measuring performance.

From a Query Window, can a Stored Procedure be opened into another Query Window?

Is there a command inside a Query Window that will open a stored procedure in another Query Window?

i.e. MODIFY dbo.pCreateGarnishmentForEmployee

I am using SQL Server Management Studio 2005 and Red Gate's SQL Prompt.

Currently I have to do the following multiple steps:

Open Object Explorer, navigate to Programmability | Stored Procedures, right-click the stored procedure name, and select Modify.

A Query Window will open up with the ALTER PROCEDURE.

As I mentioned above, what I would like to do is from a Query Window type in something to the effect of

MODIFY dbo.pCreateGarnishmentForEmployee

From stackoverflow
  • You are trying to mix two technologies here.

    1. SQL and SQLSyntax
    2. The SQL Management Tool

    It is probably not possible to use TSQL to manipulate the Management Studio, which is what you appear to want. I suspect cut and paste is your only option.

  • I think the only way I'm aware of that produces an outcome similar to what you're asking for is running sp_helptext against your stored procedure name:

    sp_helptext 'dbo.pCreateGarnishmentForEmployee'
    

    which will output the text as a resultset. Then click on the column header and copy/paste the resultset into the query window. You'll also need to change the

    CREATE PROCEDURE ...
    

    to

    ALTER PROCEDURE ...
    

    This method does not always preserve the nicely formatted layout of your stored procedure, however, so bear this in mind.

What criteria have you used to determine whether or not to implement a Software Factory?

I was discussing with a co-worker the other day the similarities of implementing a Software Factory for your development organization vs. using more of a scaffolding solution like active record. We both thought that implementing a Software factory may be considered by some to be a good idea when you have a larger group of developers and you want to maintain a certain level of consistency and convention within your code base.

Thinking about it a little more, I realized that I really like the idea of Software Factories for personal use, because they make it easier for me to code up the projects that I work on for fun, as they save me a lot of headache in writing "boilerplate" code. That being said, I would bet that enforcing the use of a Software Factory in larger organizations might cause some strife within the team, because some developers may see it as an infringement on their ability to be creative.

So what I'm wondering (from those of you who have been part of an organization that has implemented factories) is what criteria would it take to dictate the use of a factory within an organization, when the risk may be a bunch of unhappy developers?

From stackoverflow
  • In the spirit of SO, IMHO there is no organisation large or small that would get more out of a Software Factory compared to a good framework (Spring, Windsor, Active Record).

    Software factories are only fun for those who build the factory; the factory analogy is very apt. Within an SF environment coding can become repetitive and boring; you're essentially telling your team, BTW, you're actually too stupid to figure this out, so we're going to make sure you can't make any mistakes. I know that sounds harsh, but that's how it pans out (and yes, I was on the fun side of the equation when we tried it). Consistency and convention can be encouraged (I hate enforced) by all kinds of means; code review is best, but hard to do, and code analysis tools (FxCop et al) are the worst, but they do cover the basics.

    Another problem with the SF approach is when the factory doesn't meet a particular need, then the coders are lost, they've been insulated from the underlying tech to such an extent that they have no conceptual model of what's going on. It's like asking a production line drone to build an engine, they don't know where to start. On the other hand a decent mechanic will know what to do (or where to go). You should be empowering your team with great tools, not restricting them with an unnecessary factory approach.

    matt_dev : Sorry, I worded that question incorrectly; I'm not in favor of implementing factories in larger organizations. I was just wondering, if anyone had done it, what were their reasons? I actually agree with you, I think it would seriously discourage your team.
    MrTelly : The core reason to go with the factory is that we'd decided to go MVC (clever) and then tried to use the Smart Client Software Factory. The title is an oxymoron, if ever there was one.
  • I suggest that it depends on what your factory is producing. There are certainly many examples of "cliche" code whose production can be automated, allowing the development teams to focus their creativity on the parts of the code that really take human intelligence.

    It is also important (and entirely possible!) to organize the constructed artifacts with call-backs or other extension points; again the key is to make sure that the developers are able to see that your work is supporting them, rather than displacing them.

    The above mean that you (and your teams) will be forced to do more up-front planning to achieve the coordination with a minimum of friction and "oh-we-forgot-that-we-also-need..." events. That will likely produce a bit of grumbling. However, if you do it right, you'll get to step up at some point and handle some external demand by tweaking the factory, with no interruption to their own work. At that point, more of them should begin to see you as an ally.

What would be the best option, performance wise, between 1 10k rpm disk and 2 7k rpm disks in striped raid

I'm thinking of improving the performance of my development machine, and the next step is the IO subsystem, namely, the hard disks. Assuming consumer grade disks (which removes SCSI and SAS drives) and a reasonable bill (which removes the option of two or more 10k RPM disks), the two options I'm faced are:

  1. getting 1 VelociRaptor or equivalent 10k RPM SATA disk (most likely the 150gb one)
  2. getting 2 standard 7k RPM disks and setting them up as RAID 0.

The 10k RPM disk is the safest but most expensive choice, as it costs about 150€ around here for the given size. Getting the 2 other disks gives a more dangerous scenario, because the likelihood of failure doubles, but it's a less expensive option, especially if I take care and do regular full disk backups.

My main question is then one of performance: which scenario would yield a better performance in development tasks (.NET mostly, some virtual machines running, Visual Studio). Has anyone seen or done comparative benchmarks between these two scenarios? Is there any scenario I'm missing?

EDIT: I'm now leaning towards the Velociraptor. As this is a development machine, the most common scenario would be compiling, and raid 0 would not help much with "mostly" random read/writes.

I intend to do some benchmarks before and after, and I'll update this question if/when I get the data. Thank you all for your answers.

From stackoverflow
  • Striping to multiple drives should be faster, if the controller is even close to decent (as in hardware supported, not some nasty software RAID card).

    But this begs the question... have you really determined that disk I/O is your current bottleneck? If it's not then it doesn't matter anyway. ;-)

  • Option 2 would give you the bigger performance boost. If you're worried about drive failure then you should probably think about making regular backups of essential data.

  • I had a 150GB Raptor and then had to go back to a "normal" disk. Now, the build process and Tomcat startup literally hurt. You notice the improvement, but you notice the slowdown much more. :)

    Now I'm planning to go to VelociRaptor which should have even better performance than Raptor (and should be quieter/cooler as well).

    As for two disks, yes, I've heard too many horror stories; however, the main reason I chose not to go that route is that two disks simply mean more noise. VelociRaptor noise levels are great for silent computing.

  • I think a better question is not which provides better performance, but how MUCH performance benefit will you gain. Along the same lines, cost/performance increase is something to look at.

    Looking into it a few months ago, I believe I found that a two-disk RAID 0 had very meager improvements in speed and was barely faster than a one-disk setup of the same RPM.

    I believe the 10K would be faster, but do your research and look at numbers.

  • As with any performance optimization exercise, you'll need to gather real data for real scenarios.

    • Make sure you test the things you really do, not just running canned benchmarks. Benchmarks are great for comparing results from different runs, but they don't tell you enough about how things will affect what you really do.

    • Measure speed now and with various changes.

    • If you haven't already, max out your RAM.

    • Drivers will probably have a big impact on your results. Make sure you have the latest drivers when you test speed. Sometimes you can get one driver from Windows Update, one from a device manufacturer, and one from the manufacturer of a chip used on a device. Be sure to try them all. Same with BIOS for all devices.

    • Buy various options, try them out, and return the ones you don't use. I rarely buy electronics from a retail place, but if you have a local business where you get to know the staff and they're not a bunch of losers, they may help you experiment without paying a bunch of shipping & restocking fees.

    • Do backups. Windows Home Server being the easy way. It lets you restore each time you change hardware, so you can quickly do real tests. It lets you recover from a bad RAID driver update. RAID 0 increases exposure to hardware failure, so it protects you from that, too.

    Also, be sure to read the SO question "Best Dual HD Set up for Development"

    Bruno Lopes : Thanks for reminding me to measure things. I'll most likely do so, both before and after a change. Hopefully I'll post the results somewhere
  • You need to determine what kind of operation is making your hard drive a bottleneck. If the issue is read latency, then the 10k RPM drive will be a bit better, but you should strongly consider one of Intel's SSDs, even though they are expensive. RAID 0 doesn't help with latency, because both drives still have to seek to the same part of the disk. RAID 1 would only help in smarter controllers that can distribute read requests across both disks.

    If read or write bandwidth is your problem, then the two 7k RPM drives in RAID 0 will be better as long as your RAID controller is decent.

    For software development, the performance bottleneck is usually compiling, which consists of reading and writing a lot of small files. Particularly if you're using something like parallel make, the disk will be getting a lot of random read requests, so latency will be the big issue. An Intel SSD could be several times faster than a 10k RPM drive for this.

  • either of these could be best, depending on usage.

    other things equal, the 10k disk has around 1.5x the bandwidth of a 7k2 one. RAID0 roughly doubles bandwidth, so it's 2x vs 1.5x; RAID0 wins on bandwidth.

    OTOH, RAID0 does nothing to the seektime, while a 10k disk has (again) seek times around 1.5x better than the 7k2. 10k wins on seektimes.

    so, the real question is: do you need more bandwidth, or lower seektimes? if you're doing heavy multitasking, the seek times are the killer. if it's mostly one app at a time, bandwidth is king.

    in most cases, servers are latency-limited and workstations are bandwidth-limited. a few virtual machines don't change that, unless your VMs do heavy file-based IO on the background.

  • If you only care about speed then the VelociRaptor is a great drive, but I'd suggest you consider the very latest 7.2k SATA disks; they're really very fast indeed these days and of course have a lot more capacity. Presumably SSD is out of the budget?

  • Seeks and small read/writes are the main behaviours exhibited during compilation.

    For seek behaviour, RAID 0 is slightly worse than a single disk.

    A file will be split equally over the disks. If you wish to seek to that file, each drive has to seek. The heads will be in the same cylinder, but the rotational position will not be identical across the disks - it can't be, not at 7k RPM each with slightly different startups. Imagine you had a million disks - it would mean some disks would be in the worst possible rotational point and you would have to wait the longest possible rotation time. Of course, generally, the cylinder-to-cylinder seek is far slower than the rotational seek, but if you don't have to change cylinder, then RAID 0 will be slower. At all other times, it will make no difference. RAID 0 will improve throughput, but you don't need that for compilation.

    Your best bet actually is to buy some extra RAM and compile inside a RAM disk.

  • I’m late to the party on this one, but my Dad just sent me a link to this after I’d written a post on this exact topic.

    In short: measure, but in my tests the RAID stripe won by a decent margin.

Different configuration files for development and production in ASP.NET application

On a project I'm working on we have a web application with three configuration files: Web.Config, Web.Config.TestServer, and Web.Config.LiveServer.

When we release to the test server, Web.Config is renamed to Web.Config.Development and Web.Config.TestServer is renamed to Web.Config, making that configuration active for the application.

It is quite onerous keeping three very similar configuration files up to date, and this system is used across a number of applications which are part of this project; not just the website.

The differences in configuration are most commonly local directories or paths, URLs, IPs, port numbers, and email addresses.

I'm looking for a better way.

From stackoverflow
  • If you have a db server in the mix, you can create a table that has the configuration name, the property name, and the property value in it; then all you have to do is change one value in the web.config: the configuration name (dev, test, prod).

    If you have different dbs for each config, then the only thing that's different is the connection string.
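
    As a sketch, the lookup table described above might be as simple as this (the names are made up):

    CREATE TABLE AppSetting
    (
        ConfigName    varchar(20)  NOT NULL,  -- 'dev', 'test', 'prod'
        PropertyName  varchar(100) NOT NULL,
        PropertyValue varchar(500) NOT NULL,
        CONSTRAINT PK_AppSetting PRIMARY KEY (ConfigName, PropertyName)
    );

    -- Web.config only stores ConfigName; everything else comes from this table
    SELECT PropertyName, PropertyValue
    FROM   AppSetting
    WHERE  ConfigName = 'test';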

  • While your approach seems tedious, I find it to be the best approach.

    I used to keep all of my configurations in a single web.config file, and simply had the "production" section commented out.

    Shortly after this I had to do a "hybrid" test where my lookup data was coming from the production server, but the new data was being inserted into the test database. At that point I had to start piece-mealing what parts of the configuration block to comment/uncomment, and it became a nightmare.

    Similarly, we have our server administrators do the actual migration from test to production, and most of them aren't fluent enough in .NET to know how to manage the web.config files. It is far easier for them to simply see a .test or .prod file and migrate the proper one up.

    You could use something like a database to store all your configurations, but then you're running into another layer of abstraction and you have to manage that on top of things.

    Once you get the knack or the template of how your two (or three) configuration files will be set up, it becomes a lot easier to manage them, and you can have your test server configuration get modified for some unique testing without much hassle.

memory use in large data-structures manipulation/processing

I have a number of large (~100 Mb) files which I'm regularly processing. While I'm trying to delete unneeded data structures during processing, memory consumption is a bit too high. So I was wondering: is there a way to 'efficiently' manipulate large data, e.g.:

def read(self, filename):
    fc = read_100_mb_file(filename)
    self.process(fc)
def process(self, content):
    # do some processing of file content

Is there a duplication of data structures? Isn't it more memory efficient to use a class-wide variable like self.fc?

How do I garbage-collect? I know about the gc module, but do I call it after I del fc, for example? Is the garbage collector called after a del statement at all? When should I use garbage collection?

Update:
P.S. 100 Mb is not a problem in itself, but float conversion and further processing add significantly more to both the working set and virtual size (I'm on Windows).

From stackoverflow
  • Before you start tearing your hair out over the garbage collector, you might be able to avoid that 100mb hit of loading the entire file into memory by using a memory-mapped file object. See the mmap module.

    SilentGhost : 100 Mb is just fine, problem starts when it hits 1.7 Gb of virtual memory
    Crashworks : Yikes! That sounds more like you are hanging onto references to many things, so that the garbage collector can't clean them up. This can happen if you save off a reference to your intermediate data in the processing class.
    SilentGhost : that's exactly my question!
  • Don't read the entire 100 meg file in at a time. Use streams to process a little bit at a time. Check out this blog post that talks about handling large csv and xml files. http://lethain.com/entry/2009/jan/22/handling-very-large-csv-and-xml-files-in-python/

    Here is a sample of the code from the article.

    from __future__ import with_statement # for python 2.5
    
    with open('data.in','r') as fin:
        with open('data.out','w') as fout:
            for line in fin:
                fout.write(','.join(line.split(' ')))
    
    SilentGhost : it doesn't seem to scale in terms of code, I don't need just to rearrange bits, there's more processing involved
    Sam Corder : Once you have parsed a detail line and done your reduction calculations make sure you aren't hanging on to any of the objects created from parsing the details. Python GC is reference based. As long as there is a reference to an object it won't get GC'ed.
    Sam Corder : Just to add. If you have two objects that refer to each other they will never be garbage collected unless one of them lets go of the reference to the other. Check for this kind of circular reference if you see your memory usage ballooning and thing you the objects should be out of scope.
    Torsten Marek : @Sam Corder: Cyclic garbage collection has long since been added to Python.
    Sam Corder : @Torsten Marek: Very cool. Thanks for the correction.
  • So, from your comments I assume that your file looks something like this:

    item1,item2,item3,item4,item5,item6,item7,...,itemn
    

    which you all reduce to a single value by repeated application of some combination function. As a solution, only read a single value at a time:

    def read_values(f):
        buf = []
        while True:
            c = f.read(1)
            if c == ",":
                yield parse("".join(buf))
                buf = []
            elif c == "":
                yield parse("".join(buf))
                return
            else:
                buf.append(c)
    
    with open("some_file", "r") as f:
        agg = initial
        for v in read_values(f):
            agg = combine(agg, v)
    

    This way, memory consumption stays constant, unless agg itself grows over time.

    1. Provide appropriate implementations of initial, parse and combine
    2. Don't read the file byte-by-byte; instead, read into a fixed-size buffer, parse values out of the buffer, and read more as you need it (a rough sketch of this follows below)
    3. This is basically what the builtin reduce function does, but I've used an explicit for loop here for clarity. Here's the same thing using reduce:

      with open("some_file", "r") as f:
          agg = reduce(combine, read_values(f), initial)
      

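    For point 2, a rough buffered sketch (it still assumes the same hypothetical parse function, and the 64 KB buffer size is arbitrary) might look like this:

    def read_values_buffered(f, bufsize=64 * 1024):
        leftover = ""
        while True:
            chunk = f.read(bufsize)
            if not chunk:               # end of file
                break
            parts = (leftover + chunk).split(",")
            leftover = parts.pop()      # last piece may be cut off mid-value
            for part in parts:
                yield parse(part)
        if leftover:
            yield parse(leftover)
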
    I hope I interpreted your problem correctly.

    SilentGhost : I'm sorry if I put it clumsily, but by reduce I meant "make 32 KB from 100 MB"
    Torsten Marek : No, I didn't mean that, I meant the reduce builtin.
    J.F. Sebastian : I've added `reduce` example.
    J.F. Sebastian : btw, `f.read()` should be `f.read(1)` in your code. And open("somefile", r) -> open("somefile", "r").
    Torsten Marek : @J.F.: Ah, the joys of coding without testing. I've actually tried out the code and used f.read(1) there.
    S.Lott : +1: Process incrementally
  • First of all, don't touch the garbage collector. That's not the problem, nor the solution.

    It sounds like the real problem you're having is not with the file reading at all, but with the data structures that you're allocating as you process the files. Consider using del to remove structures that you no longer need during processing. Also, you might consider using marshal to dump some of the processed data to disk while you work through the next 100 MB of input files.

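    As a minimal sketch of that marshal idea (the helper names here are made up, and marshal only handles simple built-in types such as numbers, strings, lists and dicts):

    import marshal

    def spill(results, path):
        # write intermediate results to disk so they can be freed from memory
        with open(path, "wb") as out:
            marshal.dump(results, out)

    def reload(path):
        with open(path, "rb") as inp:
            return marshal.load(inp)
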
    For file reading, you have basically two options: unix-style files as streams, or memory-mapped files. For stream-based files, the default Python file object is already buffered, so the simplest code is also probably the most efficient:

      with open("filename", "r") as f:
          for line in f:
              pass  # do something with a line of the file
    

    Alternatively, you can use f.read([size]) to read blocks of the file. Usually you do this to gain CPU performance by multithreading the processing part of your script, so that you can read and process at the same time. It doesn't help with memory usage, though; in fact, it uses more memory.

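    A bare-bones version of that block-reading loop might look like this (the buffer size is arbitrary):

      with open("filename", "r") as f:
          while True:
              block = f.read(64 * 1024)
              if not block:   # an empty string means end of file
                  break
              # hand the block to a worker thread, or process it here
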
    The other option is mmap, which looks like this:

      import mmap

      with open("filename", "r+") as f:
          map = mmap.mmap(f.fileno(), 0)
          line = map.readline()
          while line != '':
              # process a line
              line = map.readline()
    

    This sometimes outperforms streams, but it also won't improve memory usage.

  • I'd suggest looking at the presentation by David Beazley on using generators in Python. This technique allows you to handle a lot of data, and do complex processing, quickly and without blowing up your memory use. IMO, the trick isn't holding a huge amount of data in memory as efficiently as possible; the trick is avoiding loading a huge amount of data into memory at the same time.

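    As a rough illustration of that idea (the file name and record layout below are invented), a pipeline of small generators keeps only one record in memory at a time:

    def read_lines(path):
        with open(path, "r") as f:
            for line in f:
                yield line

    def parse_records(lines):
        for line in lines:
            yield line.rstrip("\n").split(",")

    def large_values(records, threshold=100.0):
        for record in records:
            value = float(record[0])
            if value > threshold:
                yield value

    # each stage pulls one item at a time, so the whole file never sits in memory
    total = sum(large_values(parse_records(read_lines("data.in"))))
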
    Van Gale : Gah, as soon as I saw the question I jumped in to answer with a link to the Beazley stuff and saw you'd given it as an answer already. Oh well, have to vote you up +1 instead! Just wish I could give it more than +1.
  • In your example code, data is being stored in the fc variable. If you don't keep a reference to fc around, your entire file contents will be removed from memory when the read method ends.

    If it is not, then you are keeping a reference somewhere. Maybe the reference is being created in read_100_mb_file, maybe in process. If there is no reference, the CPython implementation will deallocate it almost immediately.

    There are some tools to help you find where this reference is: guppy, dowser, pysizer...
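
    Before reaching for any of those, the standard library can already give a quick hint about who is still holding a suspect object (a rough sketch; the output is noisy but often points at the culprit):

    import gc
    import sys

    def who_holds(obj):
        # getrefcount includes the temporary reference created by this call
        print("refcount: %d" % sys.getrefcount(obj))
        for referrer in gc.get_referrers(obj):
            print(type(referrer))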

HTML DIV and IMG tag spacing in IE vs. FF

I'm having trouble with the layout of a simple HTML page. Please help.

Here's the layout I'm going for...

Layout

  • orange = body
  • blue/red = frame div
  • green = header image
  • black/white = menu div

It looks correct in Internet Explorer, but in Firefox, Safari, and Chrome there's a 4-pixel gap between my image (header) and my div (menu).

Internet Explorer, Firefox, Safari, and Chrome...

Browsers

This is my HTML...

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html>
    ...
    <body>
        <div id="frame">
            <img id="header" src="images/header.jpg" width="700" height="245" alt="" /><div id="menu">
                <strong>One</strong> &nbsp;|&nbsp;
                <a href="two.html">Two</a> &nbsp;|&nbsp;
                <a href="three.html">Three</a> &nbsp;|&nbsp;
                <a href="four.html">Four</a> &nbsp;|&nbsp;
                <a href="five.html">Five</a> &nbsp;|&nbsp;
                <a href="six.html">Six</a> &nbsp;|&nbsp;
                <a href="seven.html">Seven</a> &nbsp;|&nbsp;
                <a href="eight.html">Eight</a> &nbsp;|&nbsp;
                <a href="nine.html">Nine</a>
            </div>
            <div id="content">
                ...
            </div>
            ...
    </body>
</html>

Notice there's no whitespace between the IMG and the menu DIV.

This is my CSS...

...

div#frame {
    background: #FF0000;
    margin-right: auto;
    margin-left: auto;
    width: 700px;
    border: 5px #30AADE solid;
}

div#frame img#header {
    margin: 0;
    padding: 0;
    border: 0;
}

div#frame div#menu {
    margin: 0 auto 0 auto;
    padding: 5px 0 5px 0;
    border-top: solid 2px #FFFFFF;
    text-align: center;
    font-size: small;
    color: #88BE34;
    background-color: #000000;
}

div#frame div#menu strong {
    font-size: medium;
    color: #FFFFFF;
}

div#frame div#menu a {
    color: #88BE34;
}

Why are Firefox, Safari, and Chrome showing that 4-pixel gap?

From stackoverflow
  • It has to do with the default rules for IMG tags.

    I recommend always using this in your stylesheet as a default rule:

    img{
        display:block;
    }
    
    Zack Peterson : That's fixed it. Thank you.
  • My guess is that it's the line height of the image element's line box, since IMG is an inline element. You could probably fix it by giving img#header display: block;.

    Anyway, what you should really do is remove the image entirely and use an H1 element plus one of the many image-replacement techniques out there.

    Edit: When that is said, your menu should also be marked up as an unordered list (UL).

  • In "standard" browsers (and in fact in IE6 with the proper DOCTYPE!), your image is rendered inline by default, so it gets spacing as if it were a letter sitting on the baseline of text.

    freelookenstein gave the solution for removing the extra space caused by this inline text alignment.

    It is the right fix, but I would be careful about applying display:block by default, as that will most likely mess up your typical web page content down the line.

    You could either add the display:block property via a class or an inline style on your image alone.

    Or something like this:

    img { display:block; }
    p img, ul img, td img /* etc*/ { display:inline; }
    

    Personally, I would recommend limiting display:block to only those images you know are used for the site layout, or those that are specifically set inside boxes. In that case you often already have a class on the parent element, like:

    <div class="imagebox">
       <img .... />
    </div>
    
    .imagebox img { display:block; }
    
  • You should wrap your menu links in an unordered list and then style the items with CSS. There are several reasons for doing this:

    1. Structuring your navigation links as a list results in more semantic markup. It better represents the content you are presenting. If you were to view the site with no CSS styles at all (you can do this with the Web Developer Toolbar for Firefox), you would still get a meaningful representation of your site layout. This is especially important if you intend the site to be viewable by mobile browsers.

    2. This may also (slightly) help search engines prioritize the content and boost your ranking.

    3. You can define a style for your list items with a border on one side and some margins to get the "pipe delimited" effect. This will be reusable and makes it easier to change the menus to some other style in the future.

    See A List Apart - CSS Design: Taming Lists

    There is an example there showing complete CSS for achieving this effect.