

 
HoboBen
Created : 21 April 2007
 
Language : Basic

String To Uppercase

Simple and light

For C - String to Uppercase

No need to pull in extra string libraries: this does the job and keeps your programs portable (no non-ANSI libraries needed).
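
The snippet's code itself is not preserved on this page. Going by the comments (the 'A'-'a' offset, no extra #include), a minimal sketch of that kind of approach might look like the following. The function name is hypothetical, and the code assumes the lowercase letters are contiguous, as they are in ASCII.

```c
/* Hypothetical helper, not the original snippet.
   Converts a null-terminated string to uppercase in place. */
void str_upper_offset(char *s)
{
    for (; *s != '\0'; ++s) {
        /* Only touch 'a'..'z'; assumes the letters are contiguous (ASCII). */
        if (*s >= 'a' && *s <= 'z') {
            *s += 'A' - 'a';   /* the offset is -32 in ASCII */
        }
    }
}
```

Because the loop stops at the terminator, it does not depend on a fixed string length. Call it on a writable array (for example `char buf[] = "hello";`), not on a string literal.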


 

Comments


Sunday, 22 April 2007, 05:34
Jayenkai
Nice and simple. Especially like the 'A'-'a' bit at the end! Certainly as portable as it gets!!

.. Except that you're assuming a 20 character string!
Sunday, 22 April 2007, 06:44
mike_g
Nice, but would it not be simpler just to add 32 to the lowercase character?
Sunday, 22 April 2007, 10:36
HoboBen
@Jay - I forgot to solve that part. I'll edit soon. Cheers
@Mike - Probably. I dunno, but I'll try that too.
Sunday, 22 April 2007, 10:41
mike_g
Whoops, converting from lower to upper case would be subtracting 32, not adding it. Here's a little example C function I made to convert any string up to 1000 characters to uppercase:

It may be possible to make it faster by using bitwise operators, since changing case in ASCII is a matter of flipping the bit with value 32 in the byte on or off. I don't know how to do that yet though, and it may not be faster anyway.
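
One way the bitwise idea might look, as a sketch rather than a measured optimisation. The function name is hypothetical and the code assumes an ASCII-compatible character set, where upper and lower case differ only in the bit worth 32 (0x20).

```c
/* Hypothetical helper: uppercase a string in place by clearing bit 0x20.
   Assumes ASCII; the range check keeps digits and punctuation intact. */
void str_upper_bitwise(char *s)
{
    for (; *s != '\0'; ++s) {
        if (*s >= 'a' && *s <= 'z') {
            *s &= ~0x20;   /* 'a' (0x61) becomes 'A' (0x41) */
        }
    }
}
```

Whether this beats subtracting 32 is not something to take for granted; a compiler may well generate similar code for both.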
Sunday, 22 April 2007, 11:28
Agent
Here's my suggestion...



Not adding/subtracting an offset to perform the conversion makes it independent of the character set and thus 100% portable.
toupper is usually implemented as a macro in the standard library.
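
The code from this post is missing here, so the following is only a sketch of a toupper-based loop, not necessarily what was posted. The function name is hypothetical, and the cast to unsigned char is an addition: toupper expects a value representable as unsigned char (or EOF), so passing a negative char directly is undefined.

```c
#include <ctype.h>

/* Hypothetical helper: uppercase a string in place via the standard library. */
void str_upper_ctype(char *s)
{
    while (*s != '\0') {
        /* toupper returns non-letters unchanged, so no range check is needed. */
        *s = (char)toupper((unsigned char)*s);
        ++s;
    }
}
```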
Sunday, 22 April 2007, 11:31
HoboBen
Yes, and that's an ANSI standard library, but it's an extra #include. Still, I'm just being fussy

Cheers, Mr. Smith
Sunday, 22 April 2007, 12:06
mike_g
Well, it's short and sweet, Agent Smith. I guess this:
loops till the end of the string. It would be better than what I was doing.
Sunday, 22 April 2007, 20:17
Yayyak
It would probably be faster, and you've eliminated an extra variable (i).

Because C strings are null-terminated, they're really easy to iterate through in a loop: you just need to check whether the current character is the null terminator ('\0').
Sunday, 22 April 2007, 21:02
Agent
Yep, although some optimizing compilers are probably smart enough to eliminate the redundant variable anyway and substitute pointer manipulation where array indexing is used.

I usually find that while loops execute faster than for loops (not only with C but most other compiled languages).

As for the issue of portability - you can't get much more portable than using the standard library. Here's what Dr. Plauger has to say about it (the discussion is for tolower, but the same thing applies to toupper):

tolower - "Use this function to force any uppercase letters to lowercase. It deals with such exotica as lowercase letters that have no corresponding uppercase letter and letters that have no case. Don't assume that you can convert an uppercase letter to its corresponding lowercase letter simply by adding or subtracting a constant value. This happens to be true for ASCII and EBCDIC, two popular character sets, but is not required by the C Standard."
Monday, 23 April 2007, 11:05
HoboBen
Cool!

Thanks Smith and all above. I've learnt a fair bit
Monday, 23 April 2007, 11:10
mike_g
Cool, I'll have to remember that. Just for fun I knocked up a quick remake, trying to shave off as many characters as I could:

Monday, 23 April 2007, 14:19
Agent
You could write:

instead of:


Also, you need two '&&'s for a logical 'And' because in C, one by itself is a bitwise 'And'.
Monday, 23 April 2007, 14:34
mike_g
*s -= 32;
Yeah thanks, I don't know why I didn't figure that out in the first place. Tbh I only realized you could do this: += from HoboBen's code.

While the single & is for bitwise anding, I read somewhere that under certain circumstances it can also be used for logical comparisons. I don't have a clue about when this rule applies or not, but I tested the function and it worked fine so I kept it as it was.
Monday, 23 April 2007, 15:09
Agent
Haha, that's interesting. I guess it works because each expression yields a '1' and so when both expressions are true 1 & 1 = 1.

However, using a logical && may produce faster code because && evaluates its operands from left to right and short-circuits: if the first expression is false, the second won't be evaluated (whereas with a bitwise '&' both expressions are evaluated each time).
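
The difference is easy to see with a right-hand operand that has a side effect. This is a small sketch with hypothetical helper names: it counts how often the right-hand side actually runs.

```c
/* Hypothetical helpers for illustration only. */
static int rhs_calls = 0;

static int rhs(void)
{
    ++rhs_calls;          /* record that the right-hand operand was evaluated */
    return 1;
}

/* Returns how many times rhs() ran for `left && rhs()`. */
int rhs_calls_with_logical_and(int left)
{
    rhs_calls = 0;
    (void)(left && rhs());   /* rhs() is skipped when left is 0 */
    return rhs_calls;
}

/* Returns how many times rhs() ran for `left & rhs()`. */
int rhs_calls_with_bitwise_and(int left)
{
    rhs_calls = 0;
    (void)(left & rhs());    /* both operands are always evaluated */
    return rhs_calls;
}
```

Note that & only behaves like a logical And when both sides are exactly 0 or 1, as comparison results are: `2 & 1` is 0, while `2 && 1` is 1.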
Monday, 23 April 2007, 17:33
Yayyak
Optimising C code is a true art form, IMHO. And it's these little tricks that tutorials never teach you. Agent Smith, have you considered writing a tutorial with tricks to help properly optimise?
Tuesday, 24 April 2007, 02:11
power mousey
simplicity.



thank you,
power mousey
Tuesday, 24 April 2007, 09:44
Agent
Yayyak: Not really, but now that you mention it... I guess it's something I could probably have a go at some time. I won't claim to be an expert though.
Friday, 12 October 2007, 10:44
bajotumn
Here mike, this is a more efficient way of doing that. Your for() loop is not needed; a do loop works for any string size without having to change anything. Also, here's a to-lower-case function to accompany the to-upper-case one.

Friday, 12 October 2007, 16:17
mike_g
Yeah, that first version I posted a while back was pretty lame. But tbh there's no real difference between using a for or while loop here, except maybe readability.

Just for fun I had a go at making my old function smaller and managed to shave an extra 3 characters off it:

Can anyone make it shorter?
Friday, 12 October 2007, 16:44
HoboBen
Bloody hell, Mike - that's compact!

I've been learning a fair bit of C recently for my Computing course - just realized how hideous my original code was!
Friday, 12 October 2007, 17:05
mike_g
just realized how hideous my original code was!

Same here dude.

I wouldn't recommend seriously using that function I posted for anything though. Agent Smith's version using the ctype toupper function would still be the best, for the reasons he mentioned.
Sunday, 14 October 2007, 07:57
Phoenix


Replacing char with int shaves off another character. Not sure if that would work though - my C is kinda rusty.
Sunday, 14 October 2007, 08:01
mike_g
Nah, that wouldn't work because incrementing the 's' pointer now moves it by sizeof(int) bytes (typically 4) at a time instead of one, and *s would be reading that many bytes from memory too. But I guess you could do:

Instead
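
The pointer-stride point can be checked directly. A small sketch with hypothetical function names: each one reports how many bytes a pointer advances when incremented once.

```c
#include <stddef.h>

/* Bytes a char pointer moves per increment: always 1. */
size_t char_stride(void)
{
    char buf[2] = {0, 0};
    char *p = buf;
    char *q = p + 1;
    return (size_t)(q - p);
}

/* Bytes an int pointer moves per increment: sizeof(int), commonly 4. */
size_t int_stride(void)
{
    int buf[2] = {0, 0};
    int *p = buf;
    int *q = p + 1;
    /* Compare as char pointers to measure the distance in bytes. */
    return (size_t)((char *)q - (char *)p);
}
```

So walking a char string through an int pointer would skip characters, which is why only the return type, not the parameter type, can safely change.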
Sunday, 14 October 2007, 08:10
Phoenix
Wouldn't that produce an error since it's not returning anything?
Sunday, 14 October 2007, 08:24
Jayenkai
The *s is a pointer, so it'll change the actual data, not a duplicate of it. No need to return anything.

(I think..)
Sunday, 14 October 2007, 08:24
mike_g
Not for me it doesn't, but it should produce a warning.

|edit| I tested it in DevC++, but I think I remember getting errors for not returning a value in VC++ 2005. Or maybe it's just my imagination. |edit|
Sunday, 14 October 2007, 08:27
Jayenkai
Ah, you meant the int at the start
Sunday, 14 October 2007, 10:06
spinal
I recently did this for lowercase...


Monday, 15 October 2007, 09:55
bajotumn
I do not recommend using the toupper() or tolower() functions as they are in a Windows-only header. If you are going to make a standard ANSI C program and wish to release it on multiple platforms such as Linux or MacOS, then you need to make the code compatible. This will work with all systems and is ANSI standard.

If you wanted to make it 1 line of code you can use bitwise manipulation.
C Representation    Reads as                                Result
letter |= 32;       letter equals letter bitwise or 32      D=d; d=d
letter &= 223;      letter equals letter bitwise and 223    d=D; D=D
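
The two masks in the table can be tried on single characters. This is a sketch with hypothetical helper names, and it assumes ASCII, where the two cases differ only in the bit worth 32.

```c
/* Hypothetical helpers; both assume ASCII. */

/* Sets bit 32: 'D' -> 'd', 'd' stays 'd'. */
char force_lower_bit(char c)
{
    return (char)(c | 32);
}

/* Clears bit 32 (223 is 255 - 32): 'd' -> 'D', 'D' stays 'D'. */
char force_upper_bit(char c)
{
    return (char)(c & 223);
}
```

The masks are only safe on letters. Applied blindly, `force_upper_bit('{')` gives '[' and `force_lower_bit('@')` gives '`', so strings containing digits or punctuation need a range check first.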


Here are a few examples:
Upper to lower:
Lower to upper:
CodeOutput


If, however, your string has other characters in it such as punctuation you will need a short if() statement, such as the one I have written below.

Monday, 15 October 2007, 12:06
mike_g
I do not recommend using toupper() or tolower() functions as they are in a windows only header

Bollocks! Where did you get that idea from? If portability is an issue then use toupper()/tolower(), as they support character encodings other than ASCII.

If you wanted to make it 1 line of code you can use bitwise manipulation.

That won't make the code any shorter. You still need to check whether you are dealing with an alpha character first. Then |= or += is the same length anyway.
Monday, 15 October 2007, 16:09
Agent
bajotumn: toupper() and tolower() have been a part of the standard library since long before Windows appeared on the scene (or even Microsoft, for that matter). I have a copy of K&R (1st Edition) and they are described in it. This was published in 1978 at a time when C was being used exclusively for UNIX on machines such as the DEC PDP-11, IBM System/370, Honeywell 6000 and Interdata 8/32.

Recommended reading: "The Standard C Library" by P.J. Plauger.
Dr. Plauger participated in the ANSI Standards committees (even chaired some) and is regarded as one of the world's leading authorities. His book includes a complete source code implementation of the entire library.